This repository was archived by the owner on Sep 17, 2024. It is now read-only.

tpcc: consistency checks are very slow and provide no progress indication #147

Description

@petermattis

checkConsistency() is currently prohibitively slow for large numbers of warehouses and provides no progress indication. The most egregious check is for W_YTD = sum(H_AMOUNT) for each warehouse:

	var sumHAmount float64
	for i := 0; i < *warehouses; i++ {
		if err := db.QueryRow("SELECT w_ytd FROM warehouse WHERE w_id=$1", i).Scan(&wYTD); err != nil {
			return err
		}
		if err := db.QueryRow("SELECT SUM(h_amount) FROM history WHERE h_w_id=$1", i).Scan(&sumHAmount); err != nil {
			return err
		}
		if wYTD != sumHAmount {
			fmt.Printf("check failed: w_ytd=%f != sum(h_amount)=%f for warehouse %d\n", wYTD, sumHAmount, i)
		}
	}

The main problem here is that SELECT SUM(h_amount) FROM history WHERE h_w_id=$1 has to do a lookup join when scanning history_h_w_id_h_d_id_idx in order to retrieve h_amount, since that index does not store the column. We could add a STORING clause to that index. Alternatively, we could perform a single GROUP BY query: SELECT h_w_id, SUM(h_amount) FROM history GROUP BY h_w_id. Not sure which would be faster. Adding the STORING clause would certainly be easier.
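
For illustration, a rough sketch of the GROUP BY variant. This is not the existing checkConsistency() code: loadTotals, mismatchedWarehouses, and checkWarehouseYTD are hypothetical names, and the sketch assumes the same warehouse and history columns used in the loop above. It issues two queries in total instead of two per warehouse and compares the results in memory, keeping the exact float comparison from the original check.

```go
package main

import (
	"database/sql"
	"fmt"
	"sort"
)

// loadTotals (hypothetical helper) runs a query that returns
// (warehouse id, total) rows and collects them into a map.
func loadTotals(db *sql.DB, query string) (map[int]float64, error) {
	rows, err := db.Query(query)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	totals := make(map[int]float64)
	for rows.Next() {
		var id int
		var total float64
		if err := rows.Scan(&id, &total); err != nil {
			return nil, err
		}
		totals[id] = total
	}
	return totals, rows.Err()
}

// mismatchedWarehouses (hypothetical helper) returns, in ascending order,
// the warehouse ids whose w_ytd differs from sum(h_amount).
// Assumption: a warehouse with no history rows is compared against 0.
func mismatchedWarehouses(wYTD, sumHAmount map[int]float64) []int {
	var bad []int
	for id, ytd := range wYTD {
		if ytd != sumHAmount[id] {
			bad = append(bad, id)
		}
	}
	sort.Ints(bad)
	return bad
}

// checkWarehouseYTD (hypothetical) replaces the per-warehouse loop with
// one scan of warehouse and one grouped scan of history.
func checkWarehouseYTD(db *sql.DB) error {
	wYTD, err := loadTotals(db, "SELECT w_id, w_ytd FROM warehouse")
	if err != nil {
		return err
	}
	sums, err := loadTotals(db, "SELECT h_w_id, SUM(h_amount) FROM history GROUP BY h_w_id")
	if err != nil {
		return err
	}
	for _, id := range mismatchedWarehouses(wYTD, sums) {
		fmt.Printf("check failed: w_ytd=%f != sum(h_amount)=%f for warehouse %d\n",
			wYTD[id], sums[id], id)
	}
	return nil
}
```

The grouped query reads the whole history table once, so it does not depend on the secondary index at all. Whether that beats the STORING approach would need to be measured.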
