Skip to content

Question about saturation check constants (1.99 vs 2.0) #140

Description

@blueWatermelonFri

Hello,

First, I've been studying the source code to better understand the implementation of the sieving algorithms, and I have a quick question about a specific design choice.

I noticed that there are two slightly different constants used when calculating and checking for saturation.

In db_stats (within siever.pyx), the theoretical saturation metric appears to be calculated using the factor 2.0:

return [(2 * tmp_histo[i] / (1+i*(1./self._core.size_of_histo))**(self.n/2.)) for i in range(self._core.size_of_histo)]

However, in the termination condition within Siever::gauss_sieve (in sieveing.cpp), a slightly smaller constant, 1.99, is used for the check:

        for (unsigned int i=0 ; i < size_of_histo; ++i)
        {
            cumul += histo[i];
            if (i>=imin && 1.99 * cumul > GBL_saturation_histo_bound[i])
            {
                return;
            }
        }

My hypothesis is that this is an intentional design choice to separate the theoretical calculation from the practical decision-making:

The 2.0 is used for pure statistical reporting, providing a result that is true to the underlying theoretical model (Gaussian Heuristic).

The 1.99 is used in the control-flow logic to provide a small margin of error, making the termination condition more robust against floating-point inaccuracies and ensuring the algorithm doesn't get stuck trying to reach a perfect, idealized value.

Could you confirm if this understanding is correct? Or is there a more subtle reason for this specific choice of 1.99?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions