Change batch/parallel Ranged Ingredient rolls from uniform random to gaussian random #4212

Open
DilithiumThoride wants to merge 7 commits into 1.20.1 from dt/uniform-ranged

@DilithiumThoride
Contributor

@DilithiumThoride DilithiumThoride commented Nov 21, 2025

What

When Ranged Ingredients are run with Batch-Parallel counts, the current implementation naively multiplies the minimum and maximum bounds by the Batch-Parallel count. While this is fast and simple, it means that Batch-Parallel Ranged Ingredient runs have far more variability in their input/output values than running those same recipes without batching or parallels.

This PR alters the Batch-Parallel roll behavior to pull from a Gaussian (normal) roll rather than a uniform random roll. This means that, especially at high Batch-Parallel counts, Ranged Ingredients will behave in a way that's closer in spirit to the Guaranteed Rolls mechanic present in Chanced Ingredients, rather than being pure uniform random (and still without running all of those individual rolls).
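To put rough numbers on the difference, here is a minimal standalone sketch (not code from this PR) that compares the standard deviation of a single uniform roll over the scaled bounds with that of n individual rolls summed. The class and method names are made up for illustration, and it assumes both bounds are inclusive.

```java
/** Compares the spread of one scaled-uniform roll with the spread of n summed rolls. */
final class RangedSpread {

    /** Variance of a discrete uniform roll over [lo, hi], both inclusive. */
    static double uniformVariance(long lo, long hi) {
        double width = hi - lo + 1;
        return (width * width - 1) / 12.0;
    }

    /** Std dev of one uniform roll over the scaled bounds [n*min, n*max]. */
    static double scaledStdDev(int n, int min, int max) {
        return Math.sqrt(uniformVariance((long) n * min, (long) n * max));
    }

    /** Std dev of n independent rolls over [min, max], summed. */
    static double summedStdDev(int n, int min, int max) {
        return Math.sqrt(n * uniformVariance(min, max));
    }

    public static void main(String[] args) {
        int n = 64, min = 1, max = 6;
        System.out.printf("scaled uniform sd: %.2f%n", scaledStdDev(n, min, max));
        System.out.printf("summed rolls sd:   %.2f%n", summedStdDev(n, min, max));
    }
}
```

For 64 parallels of a 1 to 6 range, this works out to a standard deviation of roughly 92.7 for the scaled uniform roll versus roughly 13.7 for the summed rolls.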

Implementation Details

Mojang has given us MarsagliaPolarGaussian and ClampedNormalInt provider classes to work with, meaning all that needs to be done by us is pipe in an existing IntProvider and hand them the modified limit bounds, mean, and standard deviation. However, because I don't know how basically anything in statistics works, I had to call several people (@krossgg ; @Storijophe ; @Zoryn4163 ) for help explaining the math.

AI Usage

  • [ X ] No AI driven tools were used for this pull request.

Outcome

Batch-Parallel recipe runs on Ranged Ingredients will produce input/output amounts that are much closer to what they would have been had the recipe been run individually, one after another, that number of times.

How Was This Tested

Automated test suite and a lot of debug breakpoints to get roll values and see that they're actually looking Normal and not excessive or bounds-breaking. Most testing was done against #4884, but this PR is not blocked by that one.

Potential Compatibility Issues

IntProvider type replacement only applies to the default case of UniformInt (or another ClampedNormalInt) so any other provider types will be unaffected.

@DilithiumThoride DilithiumThoride added the type: refactor, 1.20.1, and Release: Major - 0.X.0 labels Nov 21, 2025
@github-actions github-actions Bot added the Tests: Passed label Nov 21, 2025
@krossgg
Contributor

krossgg commented Nov 22, 2025

You could just call it a NormalInt, since you're trying to adhere to a scaled normal distribution. The CLT is just a way of justifying the fact that the sample average distribution for a population of independent variables will tend towards a normal distribution as the sample size becomes larger. Skip to the last paragraph for TL;DR.


In our case, we are essentially rolling $n=$parallel identical dice (discrete uniform distribution) that have faces from $a=$minInclusive to $b=$maxInclusive and summing up the results of each die.

The mean $\mu$ and variance $\sigma^2$ of one die are

$$\mu=\frac{a+b}{2} \text{ and } \sigma^2 = \frac{(b-a+1)^2 - 1}{12}$$

(Taken from Wikipedia, but this can be confirmed with the formulas for expected value and variance)

Then, when rolling $n$ dice and summing the values, the mean and variance change linearly with $n$, giving us

$$\mu_n=n\mu=n\frac{a+b}{2} \text{ and } \sigma_n^2 = n\sigma^2 = n\frac{(b-a+1)^2 - 1}{12}$$

If we were to continue doing these rolls of $n$ dice and observing the distribution of the sums, it would look more and more like a bell curve as $n$ increased. That is to say, the sums would tend towards being normally distributed with a mean of $\mu_n$ and variance of $\sigma_n^2$, or $\sum_{i=1}^n X_i \sim N(n\mu, n\sigma^2)$
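As a sanity check on the formulas above, here is a small simulation sketch (standalone, using only `java.util.Random`; the class name, seed, and trial count are arbitrary choices for illustration) that sums $n$ inclusive uniform rolls many times and compares the observed mean and variance of the sums with $n\mu$ and $n\sigma^2$.

```java
import java.util.Random;

/** Empirically checks the mean and variance of n summed uniform rolls. */
final class DiceSumCheck {

    /** Sums n independent uniform rolls over [a, b], both inclusive. */
    static long rollSum(Random random, int n, int a, int b) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += a + random.nextInt(b - a + 1);
        }
        return sum;
    }

    /** Returns {sample mean, sample variance} over the given number of simulated sums. */
    static double[] simulate(Random random, int trials, int n, int a, int b) {
        double total = 0;
        double totalSq = 0;
        for (int t = 0; t < trials; t++) {
            double s = rollSum(random, n, a, b);
            total += s;
            totalSq += s * s;
        }
        double mean = total / trials;
        double variance = totalSq / trials - mean * mean;
        return new double[] { mean, variance };
    }

    public static void main(String[] args) {
        int n = 64, a = 1, b = 6;
        double expectedMean = n * (a + b) / 2.0;
        double expectedVar = n * (Math.pow(b - a + 1, 2) - 1) / 12.0;
        double[] observed = simulate(new Random(42), 200_000, n, a, b);
        System.out.printf("mean: expected %.2f, observed %.2f%n", expectedMean, observed[0]);
        System.out.printf("variance: expected %.2f, observed %.2f%n", expectedVar, observed[1]);
    }
}
```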

If you want to learn more (and with visuals), 3b1b has a great video on the topic.


Simply put, you just need a sample from a normal distribution that has the mean and variance that we want. This is fairly easy because the normal distribution is defined very conveniently. You can just take a random sample from the standard normal distribution (via a call to nextGaussian()), then scale it by $\sigma$ and shift it by $\mu$.

float mean = parallel * (min + max) / 2f;
int s = max - min + 1;
float sd = (float) Math.sqrt(parallel * (s * s - 1) / 12f);
return random.nextGaussian() * sd + mean;

Also I just looked and Mojang already has a ValueProvider that does this - ClampedNormalInt xdd
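For reference, a standalone sketch of the same scale-and-shift idea that also rounds to an int and clamps to the scaled bounds. It uses plain `java.util.Random` rather than any Minecraft provider class, the class and method names are hypothetical, and it assumes `n * max` fits in an int.

```java
import java.util.Random;

/** Draws one value approximating the sum of n uniform rolls over [min, max] inclusive. */
final class NormalRoll {

    static int roll(Random random, int n, int min, int max) {
        double mean = n * ((min + max) / 2.0);
        double width = max - min + 1;
        double sd = Math.sqrt(n * (width * width - 1) / 12.0);

        // Scale a standard normal sample by sd, shift it by mean, round to a whole number.
        long rolled = Math.round(random.nextGaussian() * sd + mean);

        // The true sum can never leave [n*min, n*max], so clamp the tails to that range.
        long lo = (long) n * min;
        long hi = (long) n * max;
        return (int) Math.max(lo, Math.min(hi, rolled));
    }

    public static void main(String[] args) {
        Random random = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(roll(random, 64, 1, 6));
        }
    }
}
```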

@DilithiumThoride
Contributor Author

That makes this all so much easier to both understand and implement. Thank you.

@github-actions github-actions Bot added the Tests: Failed label and removed the Tests: Passed label Nov 22, 2025
@krossgg
Contributor

krossgg commented Nov 22, 2025

I thought about it again and there is a small inconsistency in the math here. Our ContentModifier class supports both multiplication and addition. I don't think anyone uses it (could be wrong), but regardless it should work properly.
In my opinion, the parallel local (you should rename it to like n or something, parallel is kind of a misnomer, since it would work for any multiplier), should depend specifically on the multiplier part of the modifier. The min and the max are still affected by both the multiplier and the adder, since that just changes the range of the roll.
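One way to read that suggestion, sketched with plain ints rather than the real ContentModifier class (the `multiplier` and `addition` parameters here are hypothetical stand-ins, and the modifier is assumed to map x to x * multiplier + addition): only the multiplier sets how many rolls are being summed, while the adder shifts the mean and both bounds without changing the spread.

```java
import java.util.Random;

/** Sketch: the multiplier drives n (and so the spread), the adder only shifts the result. */
final class ModifiedRangeRoll {

    static int roll(Random random, int min, int max, int multiplier, int addition) {
        int n = multiplier; // number of summed rolls; the adder plays no part in this
        double width = max - min + 1;
        double mean = n * ((min + max) / 2.0) + addition;
        double sd = Math.sqrt(n * (width * width - 1) / 12.0);

        // Both bounds are affected by the multiplier and the adder.
        long lo = (long) n * min + addition;
        long hi = (long) n * max + addition;

        long rolled = Math.round(random.nextGaussian() * sd + mean);
        return (int) Math.max(lo, Math.min(hi, rolled));
    }

    public static void main(String[] args) {
        // 8x multiplier plus a flat 3 on a 2..5 range: results fall in [19, 43].
        System.out.println(roll(new Random(), 2, 5, 8, 3));
    }
}
```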

Glad to have helped you learn though :^)

@github-actions github-actions Bot added the Tests: Passed label and removed the Tests: Failed label Jun 3, 2026
@Zoryn4163
Contributor

Forgive my sudden intrusion but this topic was brought to me and I wanted to contribute my understanding:

The initial samples would follow a uniform distribution. If you were to perform some Y number of rolls (the distribution would be approximately normal at Y=1000) of n samples in the range [a,b], sum them, sort them, and then take a given quantile from them, you could achieve a likely range of observable outcomes. You could create a cache for this (rolls, min, max, pcnt) -> (lower, upper) to dramatically reduce the number of times this is simulated (since X~N at a high Y the values of a given n, min, max, pcnt would remain roughly consistent). This function could also calculate the mu and sigma of the samples and use those to produce ranges (see my LaTeX below) but I deemed it unnecessary for the Java to go to such lengths.

You could then leverage this to, for example, have weighted rolls of some predefined topXPcnt (68%, 95%, and 99.7% are common for 1, 2, and 3 standard deviations) or just outright roll a nextInt() on the lower and upper ranges.

Sample Java of a potential approach:

    // call it with e.g. for 64 rolls of range [0, 40), find the top 99.5% of values in the range
    var calc = calcMinMaxN(64, 0, 40, 0.995);
    // output sample: {"n":64,"min":0.0,"max":40.0,"lower":965.0,"upper":1520.0}

    ///////////////////////////////////////

    // number of simulated sums used to estimate the quantiles
    static final int NUM_SAMPLES_NORMAL = 1000;
    static final Random random = new Random();

    static normalInterval calcMinMaxN(int n, int min, int max, double topXPercent) {
        // insert some cache lookup here to early return instead of performing heavy computations if we've already found the (lower, upper) for a given (n, min, max, pcnt)
        int[] samples = new int[NUM_SAMPLES_NORMAL];
        for (int i = 0; i < samples.length; i++) {
            int sum = 0;
            for (int j = 0; j < n; j++) {
                sum += random.nextInt(min, max);
            }
            samples[i] = sum;
        }

        Arrays.sort(samples);

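        // widen to a two-sided quantile, e.g. 0.995 -> (0.0025, 0.9975)
        topXPercent = topXPercent + ((1 - topXPercent)/2);

        // clamp the upper index so topXPercent == 1.0 cannot run past the end of the array
        double lower = samples[(int)((1 - topXPercent) * samples.length)];
        double upper = samples[Math.min(samples.length - 1, (int)(topXPercent * samples.length))];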

        // cache the values for faster retrieval later (don't sim the same n, min, max, pcnt more than once)
        //addCacheValue(n, min, max, lower, upper);
        return new normalInterval(n, min, max, lower, upper);
    }

    /////////////////////

      class normalInterval {
          public int n;
          public double min;
          public double max;
          public double lower;
          public double upper;
      
          public normalInterval(int n, double min, double max, double lower, double upper) {
              this.n = n;
              this.min = min;
              this.max = max;
              this.lower = lower;
              this.upper = upper;
          }
      }
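The cache lookup left as a comment in the sample above could look something like the sketch below. This is one possible shape under stated assumptions, not a definitive implementation: the `Key` and `Interval` records, the `lookup` name, and the choice of `ConcurrentHashMap` are all illustrative, `min < max` is assumed, and `max` is treated as exclusive to match the sample's `nextInt(min, max)` call.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

/** Simulates each (n, min, max, pcnt) combination at most once and reuses the result. */
final class IntervalCache {

    record Key(int n, int min, int max, double topXPercent) {}

    record Interval(double lower, double upper) {}

    private static final int NUM_SAMPLES = 1000;
    private static final Random RANDOM = new Random();
    private static final Map<Key, Interval> CACHE = new ConcurrentHashMap<>();

    /** Returns the cached interval, running the simulation only on the first request. */
    static Interval lookup(int n, int min, int max, double topXPercent) {
        return CACHE.computeIfAbsent(new Key(n, min, max, topXPercent), IntervalCache::simulate);
    }

    private static Interval simulate(Key key) {
        int[] samples = new int[NUM_SAMPLES];
        for (int i = 0; i < samples.length; i++) {
            int sum = 0;
            for (int j = 0; j < key.n(); j++) {
                // Uniform over [min, max), i.e. max exclusive.
                sum += key.min() + RANDOM.nextInt(key.max() - key.min());
            }
            samples[i] = sum;
        }
        Arrays.sort(samples);

        // Split the excluded share evenly between the two tails.
        double tail = (1 - key.topXPercent()) / 2;
        int lowerIndex = (int) (tail * samples.length);
        int upperIndex = Math.min(samples.length - 1, (int) ((1 - tail) * samples.length));
        return new Interval(samples[lowerIndex], samples[upperIndex]);
    }
}
```

Because the key is a record, equal argument combinations hash to the same entry, so repeated calls such as `IntervalCache.lookup(64, 0, 40, 0.995)` return the stored interval instead of re-running the 1000 simulated sums.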

I also wrote an approach in R:

n <- 64
range <- c(0, 40)

samples <- numeric(1000)
for (i in 1:1000) {
  samples[i] <- sum(sample(range[1]:range[2], n, replace = TRUE))
}

mu <- mean(samples)
sigma <- sd(samples)

lower <- quantile(samples, 0.005)
upper <- quantile(samples, 0.995)

result <- c(lower, upper)

And hand-calculated some values that match up with the generated range:

$$ \begin{gathered} \mu = 1279.1832 \hspace{1cm} \sigma = 95.4363 \\ TopPcnt=0.995 \hspace{1cm} z = 2.575 \\ x_l = \mu - z\sigma \hspace{1cm} x_u = \mu + z\sigma \\ x_l = 1279.1832 - (2.575 * 95.4363) \hspace{1cm} x_u = 1279.1832 + (2.575 * 95.4363) \\ x_l = 1279.1832 - 245.7484 \hspace{1cm} x_u = 1279.1832 + 245.7484 \\ x_l = 1033.4348 \hspace{1cm} x_u = 1524.9316 \\ RANGE = (1033.4348, 1524.9316) \end{gathered} $$
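For comparison, the closed-form values can be worked out from the discrete uniform formulas given earlier in this thread, assuming the inclusive range $a=0$, $b=40$ and $n=64$ used in the R sample, and the same $z = 2.575$:

$$ \begin{gathered} \mu_n = n\frac{a+b}{2} = 64 \cdot 20 = 1280 \\ \sigma_n = \sqrt{n\frac{(b-a+1)^2-1}{12}} = \sqrt{64 \cdot \frac{1680}{12}} = \sqrt{8960} \approx 94.66 \\ x_l \approx 1280 - (2.575 \cdot 94.66) \approx 1036.26 \hspace{1cm} x_u \approx 1280 + (2.575 \cdot 94.66) \approx 1523.74 \end{gathered} $$

These land within a few units of the sampled $\mu$, $\sigma$, and range above, which is about what one would expect from 1000 simulated sums.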

Finally, looking at a histogram of the samples generated in R shows a clear approximately-normal distribution:
[image: histogram of the sampled sums]

Disclaimer: I am a uni student and by no means an expert. This is based on my current level of understanding of these topics.

@Zoryn4163
Contributor

I way overthought my prior statement. If n is sufficiently large it's trivial to just pluck a value out of the normal distribution given a starting uniform distribution:

static double randomNormal(int n, int min, int max) {
        double mean = n * ((min + max) / 2.0);
        double diff = max - min;
        double sd = Math.sqrt(n * Math.pow(diff,  2) / 12.0);
        double unbounded = (mean + sd * random.nextGaussian());
        return Math.max(n * min, Math.min(n * max, unbounded));
    }

Technically works for any n but the variance at lower n will be significant.
Depending on the range of n needed this may be sufficient.
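One detail worth checking when comparing this with the earlier formula in the thread: `Math.pow(diff, 2) / 12.0` is the variance of a continuous uniform distribution, while integer rolls over an inclusive range follow the discrete form `((max - min + 1)^2 - 1) / 12`. The sketch below (standalone, with hypothetical names) puts the two side by side for a single roll.

```java
/** Compares the continuous and discrete uniform variance formulas for one roll. */
final class VarianceFormulas {

    /** Continuous uniform variance: (max - min)^2 / 12. */
    static double continuous(int min, int max) {
        double diff = max - min;
        return diff * diff / 12.0;
    }

    /** Discrete uniform variance over the integers min..max, both inclusive. */
    static double discrete(int min, int max) {
        double width = max - min + 1;
        return (width * width - 1) / 12.0;
    }

    public static void main(String[] args) {
        int[][] ranges = { { 0, 1 }, { 1, 6 }, { 0, 40 } };
        for (int[] r : ranges) {
            System.out.printf("[%d, %d]: continuous %.4f, discrete %.4f%n",
                    r[0], r[1], continuous(r[0], r[1]), discrete(r[0], r[1]));
        }
    }
}
```

The gap is largest for narrow ranges (for 0 to 1 the discrete variance is three times the continuous one) and shrinks as the range widens (133.33 versus 140 for 0 to 40), so which form to use matters mostly for small ranges.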

@DilithiumThoride
Contributor Author

DilithiumThoride commented Jun 6, 2026

> I way overthought my prior statement. If n is sufficiently large it's trivial to just pluck a value out of the normal distribution given a starting uniform distribution:
>
> static double randomNormal(int n, int min, int max) {
>         double mean = n * ((min + max) / 2.0);
>         double diff = max - min;
>         double sd = Math.sqrt(n * Math.pow(diff,  2) / 12.0);
>         double unbounded = (mean + sd * random.nextGaussian());
>         return Math.max(n * min, Math.min(n * max, unbounded));
>     }
>
> Technically works for any n but the variance at lower n will be significant. Depending on the range of n needed this may be sufficient.

I'm going to implement this solution. It's almost the same as Kross's solution; I just understand the explanation and justification more fully now, seven months later.
Thank you.

@DilithiumThoride DilithiumThoride marked this pull request as ready for review June 6, 2026 23:57
@DilithiumThoride DilithiumThoride requested a review from a team as a code owner June 6, 2026 23:57