Big Y Data Downloads
After you have received results for a Big Y test, your data will be available to download in a variety of formats. All of these are accessible from your Big Y Results & Matches pages.
The options to download your data will be available as blue buttons near the top right corner of the screen.
Data Downloads come in a variety of formats, and each has a different purpose. If you are unfamiliar with the file types or the data they contain, take a look at the Types of Data Files section below.
The options for downloads are:
- Download Named Variants (CSV) - Downloads a CSV file with the following information:
- SNP Name - Lists all named SNPs for which you are positive.
- Position - The location of the variant on the Y chromosome. This position corresponds to the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
- Ancestral - Displays the nucleotide base indicated by the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
- Derived - Displays the current DNA nucleotide you have at that position.
- Download Private Variants (CSV) - Downloads a CSV file with the following information:
- Position - The location of the variant on the Y chromosome. This position corresponds to the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
- Ancestral - Displays the nucleotide base indicated by the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
- Derived - Displays the current DNA nucleotide you have at that position.
- Download Matches (CSV) - Downloads a CSV file with the following information:
- Full_Name - The full name as listed by your matches. This is further separated into columns for First, Middle, and Last names to make searching and filtering easier.
- Match_Date - This indicates the date on which you and this person were identified as matches. This may be the date you tested, the date they tested, or the date either of you upgraded from the Big Y-500 test to the Big Y-700 test.
- Haplogroup - This lists the furthest downstream subclade or haplogroup of your match.
- Non_Matching_Known_SNPs - This includes both known SNPs and novel variants that you do not share with your match.
- Y37, Y67, and Y111 Genetic Distance - If your Big Y match is also an STR match at any of these levels, it will list your Genetic Distance with them. Otherwise it will list “X.”
- Big_Y_STR_Difference - This lists the number of mismatches you have out of the total number of Big Y STRs you and a match both have. You can learn more about Big Y STRs here. NOTE: This does not include STRs for which you have results but your match does not, or vice versa.
- Paternal_Ancestor - This lists any information about the Most Distant Known direct paternal ancestor that your match has provided.
- Note - This lists any private notes you have added to your match.
- Linked_Relationship - Indicates if you have linked this match to your family tree.
- Download VCF - Contains a list of all variants found.
- Buy Raw Data - Takes you to the shopping cart to purchase your Big Y raw data in BAM file format.
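If you prefer to work with the Matches CSV in a script rather than a spreadsheet, the sketch below shows one way to list only the Big Y matches who are also STR matches at a given level. It is an illustration only: the header names (`Full_Name`, `Y37`, and so on) are assumed from the column list above, the sample rows are invented, and the headers in your downloaded file may differ.

```python
import csv
import io

# Invented sample rows. The header names are assumptions based on the column
# list in this article; check them against your own download.
SAMPLE = """Full_Name,Haplogroup,Y37,Y67,Y111
Alex Example,R-EX1,2,X,X
Sam Sample,R-EX2,X,X,X
"""


def str_matches(handle, level="Y37"):
    """Return (name, genetic distance) pairs for rows that are STR matches at `level`."""
    found = []
    for row in csv.DictReader(handle):
        value = row[level].strip()
        # An "X" means the person is not an STR match at this level.
        if value.upper() != "X":
            found.append((row["Full_Name"], int(value)))
    # Closest matches (smallest genetic distance) first.
    return sorted(found, key=lambda pair: pair[1])


matches = str_matches(io.StringIO(SAMPLE))
print(matches)
```

To run this against a real download, pass an opened file (for example `open(path, newline="")`, where `path` is wherever you saved the CSV) instead of the in-memory sample.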
Types of Data Files
CSV
A Comma Separated Value file (CSV file) is designed to be read by spreadsheet applications such as Google Sheets or Microsoft Excel. It is a common and versatile file format that makes it easy to access and sort your variants quickly.
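A CSV file can also be read by a short script. The following is a minimal sketch using Python's standard `csv` module to load a Named Variants file and sort it by position. The header names (`SNP Name`, `Position`, `Ancestral`, `Derived`) are assumed from the column descriptions above, and the sample rows are invented placeholders rather than real SNPs.

```python
import csv
import io

# Invented placeholder rows. Header names are assumed from the column
# descriptions in this article and may not match the real file exactly.
SAMPLE = """SNP Name,Position,Ancestral,Derived
EX1,1000100,C,T
EX2,1000050,G,A
"""


def load_named_variants(handle):
    """Read variant rows and return them sorted by chromosome position."""
    rows = list(csv.DictReader(handle))
    for row in rows:
        # Positions arrive as text; convert them so sorting is numeric.
        row["Position"] = int(row["Position"])
    return sorted(rows, key=lambda row: row["Position"])


variants = load_named_variants(io.StringIO(SAMPLE))
for v in variants:
    print(f'{v["SNP Name"]}: {v["Ancestral"]}>{v["Derived"]} at {v["Position"]}')
```

The Private Variants CSV has no SNP Name column according to the list above, so a script for that file would need to skip that field.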
VCF
A Variant Call Format file (VCF file) is a text file that lists the variants found in your sequencing data. Important things to note about Big Y VCF files:
- Variants that did not pass our analysis standards are included for informational purposes only.
- The download will be a zipped archive that contains a VCF and a BED file.
- The BED file contains the regions that were targeted in Big Y and passed our sequencing and analysis quality control.
- You can find more info about the VCF file format here.
- You can find more info about the BED file format here.
BED
A Browser Extensible Data file (BED file) is a text-based file format used to store the precise location of genomic regions and their associated annotations. It is a highly versatile and widely used format in bioinformatics. It allows researchers to define and visualize specific features on a chromosome.
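For readers comfortable with scripting, the VCF and BED files from the zipped download can be combined to keep only variants that passed filtering and fall inside the covered regions. The sketch below makes several assumptions: that the archive members end in `.vcf` and `.bed` and are plain text, that passing variants carry `PASS` in the standard VCF FILTER column, and that the BED file uses the standard 0-based start and exclusive end. The file names, the `LowQual` label, and the sample records are invented for illustration.

```python
import io
import zipfile


def read_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples."""
    regions = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def read_vcf(lines):
    """Parse VCF data lines, skipping header lines that start with '#'."""
    variants = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        variants.append(
            {"chrom": f[0], "pos": int(f[1]), "ref": f[3], "alt": f[4], "filter": f[6]}
        )
    return variants


def in_regions(variant, regions):
    """BED starts are 0-based and ends exclusive; VCF positions are 1-based."""
    return any(
        chrom == variant["chrom"] and start < variant["pos"] <= end
        for chrom, start, end in regions
    )


def load_archive(zip_bytes):
    """Read the first .vcf and .bed members of a zip (member naming is an assumption)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
        vcf_name = next(n for n in names if n.endswith(".vcf"))
        bed_name = next(n for n in names if n.endswith(".bed"))
        vcf_lines = archive.read(vcf_name).decode("utf-8").splitlines()
        bed_lines = archive.read(bed_name).decode("utf-8").splitlines()
    return read_vcf(vcf_lines), read_bed(bed_lines)


# Build a tiny in-memory archive with invented records for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "example.vcf",
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
        "chrY\t150\t.\tC\tT\t50\tPASS\t.\n"
        "chrY\t250\t.\tG\tA\t10\tLowQual\t.\n"
        "chrY\t900\t.\tA\tG\t60\tPASS\t.\n",
    )
    archive.writestr("example.bed", "chrY\t100\t300\n")

variants, regions = load_archive(buffer.getvalue())
reliable = [v for v in variants if v["filter"] == "PASS" and in_regions(v, regions)]
print([v["pos"] for v in reliable])
```

In this invented example, only the variant at position 150 is kept: the one at 250 did not pass filtering, and the one at 900 lies outside the BED region.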
BAM
A Binary Alignment Map (BAM file) is a compressed, binary file format that represents the raw data of a DNA test. It contains the alignment of your sequenced DNA fragments, known as reads, to a reference genome. The BAM file includes essential information such as the location of each read on a chromosome, the sequence of the read, and a quality score for each base. It is a highly efficient format used by bioinformatics software for advanced genetic analysis.
Your Big Y data can be compiled into a BAM file. Converting, storing, and providing this large file entails an additional internal cost. Originally this cost was included in the price of the Big Y test, and the BAM file was available to download when results were complete. However, only a small portion of Big Y participants request this download. Rather than continue to pass this cost along to all Big Y participants, we have chosen to charge only those who wish to download their BAM file. You can read more about BAM files here.