Possible inversion of LOCO logic in build_k_matrix function

Hi Dr. Endelman,
I’ve been benchmarking the kinship-calculating section of the script on GitHub and noticed a small but critical inversion in the way the LOCO (Leave-One-Chromosome-Out) matrices are built.
Current code (set.K):
```R
f1 <- function(marks, data) {
      M <- scale(data@geno[, marks], center = T, scale = F)
      K <- tcrossprod(M)
      K <- K/mean(diag(K))

markers <- split(data@map$Marker, data@map$Chrom)   # per-chr SNPs
K <- lapply(markers, f1, data = data)               # f1 uses those SNPs
```
This produces “chromosome-specific” kinship matrices, i.e. each matrix is built from the SNPs on the same chromosome that will later be tested.  In the standard LOCO implementation, we do the opposite: when we scan chromosome i we estimate kinship from all SNPs except those on chromosome i, so that the random-effect term does not absorb the very signal we are trying to detect.

```R
allSNPidx <- seq_len(ncol(data@geno))
markers   <- split(data@map$Marker, data@map$Chrom)
K <- lapply(names(markers), function(chr) {
  out   <- which(data@map$Chrom == chr)  
  M     <- scale(data@geno[, -out], center = T, scale = F)
  tcrossprod(M) / mean(diag(tcrossprod(M)))
})
```

That would make the resulting list K[[i]] correspond to the kinship matrix used when scanning chromosome i, consistent with the usual LOCO definition.

Thanks very much for maintaining the code—it's been incredibly useful.

Best regards,
Zhiyuan
China


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible inversion of LOCO logic in build_k_matrix function #25

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Possible inversion of LOCO logic in build_k_matrix function #25

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions