Hi, sorry to disturb you.
I am trying to understand the training strategy used for the HIV dataset so that I can replicate the results in your publication.
It seems that the dataset can be divided into non-cognate groups (the CEF, AY9, and No Peptide conditions) and cognate groups (where there is an epitope). That gives 3 * 3 non-cognate samples and 25 * 3 cognate samples. I saw in the paper that DeepTCR can distinguish non-cognate samples from cognate samples, and that training used two out of every three samples, with the third held out.
My question is: when training, did you
- fit the model on all (3+25) * 2 samples at once, where 3 * 2 are non-cognate and 25 * 2 are cognate, and then test it on the remaining 3+25 samples to see whether it correctly predicts whether each sample is cognate or non-cognate?
- Or did you use (3+1) * 2 samples, where 3 * 2 are non-cognate and 1 * 2 come from one specific epitope, instead of using all 25 * 2 samples as the cognate group, and then test the model on the remaining 3+1 samples to see whether it correctly predicts which (one) sample is cognate?
In the second case, did you then repeat that procedure for each specific epitope (MSPRTLNAW, NTQGYFPDW, etc.)?
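To make the two options concrete, here is a rough sketch of the splits as I picture them. It does not call DeepTCR at all; it only builds the train/test sample lists. The replicate indices, the placeholder epitope names beyond the two mentioned above, and the choice of which replicate is held out are all my own assumptions, not something taken from the paper or the dataset.

```python
from itertools import product

NON_COGNATE = ["CEF", "AY9", "NoPeptide"]
# Only the first two epitope names come from the dataset as I know it;
# the rest are placeholders standing in for the other 23 epitopes.
COGNATE = ["MSPRTLNAW", "NTQGYFPDW"] + [f"epitope_{i}" for i in range(3, 26)]
REPLICATES = [0, 1, 2]  # assumed: three samples per condition


def split(conditions, held_out):
    """Keep two of three samples per condition for training, hold out one."""
    samples = list(product(conditions, REPLICATES))
    train = [(c, r) for c, r in samples if r != held_out]
    test = [(c, r) for c, r in samples if r == held_out]
    return train, test


def option_1(held_out=2):
    """One model: all non-cognate conditions vs. all 25 epitopes together."""
    return split(NON_COGNATE + COGNATE, held_out)


def option_2(epitope, held_out=2):
    """One model per epitope: non-cognate conditions vs. that single epitope."""
    return split(NON_COGNATE + [epitope], held_out)


def label(condition):
    """Binary target: 0 for non-cognate, 1 for cognate."""
    return 0 if condition in NON_COGNATE else 1


if __name__ == "__main__":
    train, test = option_1()
    print("option 1:", len(train), "train /", len(test), "test")
    for epitope in COGNATE[:2]:
        train, test = option_2(epitope)
        print("option 2,", epitope + ":", len(train), "train /", len(test), "test")
```

Under option 1 there would be a single model trained on (3+25) * 2 samples, while under option 2 there would be 25 separate models, each trained on (3+1) * 2 samples.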
Thanks and looking forward to your reply!