
Big Y - Data Downloads

 

After you have received results for a Big Y test, your data will be available to download in a variety of formats. All of these are accessible from your Big Y Results & Matches pages.

NOTE: Two-factor authentication (2FA) must be enabled to download your data or matches.

The options to download your data will be available as blue buttons near the top right corner of the screen.

[Image: Main Results landing page]

Data Downloads come in a variety of formats, and each has a different purpose. If you are unfamiliar with the file types or the data they contain, take a look at the Types of Data Files section below. 

The options for downloads are:

[Image: Big Y Results download data dropdown, with numbered options]

  1. Download Named Variants (CSV) - Downloads a CSV file with the following information:
    • SNP Name - Lists all named SNPs for which you are positive.
    • Position - The location of the variant on the Y chromosome. This position corresponds to the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Ancestral - Displays the nucleotide base indicated by the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Derived - Displays the current DNA nucleotide you have at that position.
  2. Download Private Variants (CSV) - Downloads a CSV file with the following information:
    • Position - The location of the variant on the Y chromosome. This position corresponds to the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Ancestral - Displays the nucleotide base indicated by the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Derived - Displays the current DNA nucleotide you have at that position.
  3. Download Matches (CSV) - Downloads a CSV file with the following information:
    • Full_Name - The full name as listed by your match. This is further separated into columns for First, Middle, and Last names to make searching and filtering easier.
    • Match_Date - This indicates the date on which you and the match were identified as matches. This may be the date you tested, the date they tested, or the date either of you upgraded from the Big Y-500 test to the Big Y-700 test.
    • Haplogroup - This lists the furthest downstream subclade or haplogroup of your match.
    • Non_Matching_Known_SNPs - This includes both known SNPs and novel variants that you do not share with your match.
    • Y37, Y67, and Y111 Genetic Distance - If your Big Y match is also an STR match at any of these levels, this lists your Genetic Distance from them. Otherwise, it lists "X."
    • Big_Y_STR_Difference - This lists the number of mismatches you have out of the total number of Big Y STRs you and a match both have. You can learn more about Big Y STRs here.
      NOTE: This does not include STRs for which you have results but your match does not, or vice versa. 
    • Paternal_Ancestor - This lists any information about the Most Distant Known direct paternal ancestor that your match has provided.
    • Note - This lists any private notes you have added to your match.
    • Linked_Relationship - Indicates if you have linked this match to your family tree.
  4. Download VCF - Contains a list of all variants found.
  5. Buy Raw Data - Takes you to the shopping cart to purchase your Big Y raw data in a BAM file format.
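
If you prefer scripting to spreadsheets, the Matches download described above can be filtered in a few lines of Python. This is a minimal sketch rather than an official tool: the sample rows are invented, and the header Y37_Genetic_Distance is an assumed name for the Y37 column, so check it against the header row of your own file.

```python
import csv
import io

# Invented sample standing in for a Matches download. The header names,
# especially Y37_Genetic_Distance, are assumptions for illustration.
SAMPLE = """Full_Name,Haplogroup,Y37_Genetic_Distance,Big_Y_STR_Difference
John A Smith,R-BY1234,2,5
Mark Jones,R-BY5678,X,12
"""


def str_matches(rows, column="Y37_Genetic_Distance"):
    """Keep rows whose genetic-distance column is not the placeholder 'X'."""
    return [row for row in rows if row[column].strip().upper() != "X"]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in str_matches(rows):
    print(row["Full_Name"], row["Haplogroup"], row["Y37_Genetic_Distance"])
```

To run this against a real download, replace io.StringIO(SAMPLE) with an opened file, for example open("matches.csv", newline=""), where the file name is whatever you saved the download as.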

 

Types of Data Files

CSV

A Comma Separated Value file (CSV file) is designed to be read by spreadsheet applications such as Google Sheets or Microsoft Excel. It is a common and versatile file format that makes it easy to access and sort your variants quickly.
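
A CSV file can also be read by a short script. The sketch below assumes the header row uses the field names listed for the Named Variants download (SNP Name, Position, Ancestral, Derived); the two rows are invented examples, not real SNPs.

```python
import csv
import io

# Invented rows; the header text is assumed to match the field names
# listed for the Named Variants download.
SAMPLE = """SNP Name,Position,Ancestral,Derived
EX2,2000500,C,T
EX1,1000200,G,A
"""


def load_variants(handle):
    """Read variant rows and return them sorted by numeric position."""
    rows = list(csv.DictReader(handle))
    return sorted(rows, key=lambda row: int(row["Position"]))


variants = load_variants(io.StringIO(SAMPLE))
for v in variants:
    print(v["SNP Name"], v["Position"], f'{v["Ancestral"]}>{v["Derived"]}')
```

Sorting on int(row["Position"]) matters because CSV values are read as text, and text sorting would place "10" before "9".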

VCF

A Variant Call Format file (VCF file) is a text-based file that lists the variants found in your results relative to the reference genome. Important things to note about Big Y VCF files:

  • Variants that did not pass our analysis standards are included for informational purposes only.
  • The download will be a zipped archive that contains a VCF and a BED file.
  • The BED file contains the regions that were targeted in Big Y and passed our sequencing and analysis quality control.
  • You can find more info about the VCF file format here.
  • You can find more info about the BED file format here.
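
Because the VCF includes variants that did not pass analysis standards, a script will usually want to separate passing records from the rest. The sketch below relies only on the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) and the standard PASS value in the FILTER column. The sample lines are invented, and "LowQual" is a placeholder label; the labels in an actual Big Y file may differ.

```python
# Invented VCF lines; "LowQual" is a placeholder for a non-passing label.
SAMPLE_VCF = """##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
chrY\t1000200\t.\tG\tA\t60\tPASS\t.
chrY\t2000500\t.\tC\tT\t12\tLowQual\t.
"""


def passing_variants(lines):
    """Yield (chrom, pos, ref, alt) for records whose FILTER column is PASS."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, metadata lines, and the header row
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        if filt == "PASS":
            yield chrom, int(pos), ref, alt


passed = list(passing_variants(SAMPLE_VCF.splitlines()))
print(passed)
```

For a real file, pass an opened file object from the unzipped download in place of SAMPLE_VCF.splitlines().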

BED

A Browser Extensible Data file (BED file) is a text-based file format used to store the precise location of genomic regions and their associated annotations. It is a highly versatile and widely used format in bioinformatics. It allows researchers to define and visualize specific features on a chromosome.
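
One practical use of a BED file is checking whether a given position falls inside a covered region. The sketch below uses the standard BED convention that start coordinates are 0-based and end coordinates are exclusive, while positions in a VCF are 1-based. The regions shown are invented for illustration.

```python
# Invented regions: chromosome, 0-based start, exclusive end.
SAMPLE_BED = "chrY\t1000000\t1000500\nchrY\t2000500\t2001000\n"


def load_regions(lines):
    """Parse the first three BED columns into (chrom, start, end) tuples."""
    regions = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines, comments, and optional header lines
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def is_covered(regions, chrom, pos):
    """Return True if the 1-based position pos lies inside any region."""
    return any(c == chrom and start < pos <= end for c, start, end in regions)


regions = load_regions(SAMPLE_BED.splitlines())
print(is_covered(regions, "chrY", 1000200))
print(is_covered(regions, "chrY", 2000500))
```

The comparison start < pos <= end is the 1-based equivalent of the 0-based test start <= pos - 1 < end, which is why a position equal to a region's start value is not counted as covered.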

BAM 

A Binary Alignment Map (BAM file) is a compressed, binary file format that represents the raw data of a DNA test. It contains the alignment of your sequenced DNA fragments, known as reads, to a reference genome. The BAM file includes essential information such as the location of each read on a chromosome, the sequence of the read, and a quality score for each base. It is a highly efficient format used by bioinformatics software for advanced genetic analysis.

Your Big Y data can be compiled into a BAM file. This entails an additional internal cost to convert, store, and provide this large file format. Originally, this cost was included in the price of the Big Y test, and the BAM file was available to download when results were complete. However, only a small portion of Big Y participants request this download. Rather than continue to pass this cost along to all Big Y participants, we have chosen to charge only those who wish to download their BAM file. You can read more about BAM files here.
