Big Y Data Downloads

After you have received results for a Big Y test, your data will be available to download in a variety of formats. All of these are accessible from your Big Y Results & Matches pages.

The options to download your data will be available as blue buttons near the top right corner of the screen.

Data Downloads come in a variety of formats, and each has a different purpose. If you are unfamiliar with the file types or the data they contain, take a look at the Types of Data Files section below. 

The options for downloads are:

  1. Download Named Variants (CSV) - Downloads a CSV file with the following information:
    • SNP Name - Lists all named SNPs for which you are positive.
    • Position - The location of the variant on the Y chromosome. This position corresponds to the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Ancestral - Displays the nucleotide base indicated by the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Derived - Displays the current DNA nucleotide you have at that position.
  2. Download Private Variants (CSV) - Downloads a CSV file with the following information:
    • Position - The location of the variant on the Y chromosome. This position corresponds to the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Ancestral - Displays the nucleotide base indicated by the GRCh38 human reference genome, which is maintained by the Genome Reference Consortium.
    • Derived - Displays the current DNA nucleotide you have at that position.
  3. Download Matches (CSV) - Downloads a CSV file with the following information:
    • Full_Name - The full name as listed by your match. This is further separated into columns for First, Middle, and Last names to make searching and filtering easier.
    • Match_Date - This indicates the date the match between you was identified. This may be the date you tested, the date they tested, or the date either of you upgraded from the Big Y-500 test to the Big Y-700 test.
    • Haplogroup - This lists the furthest downstream subclade or haplogroup of your match.
    • Non_Matching_Known_SNPs - This includes both known SNPs and novel variants that you do not share with your match.
    • Y37, Y67, and Y111 Genetic Distance - If your Big Y match is also an STR match at any of these levels, it will list your Genetic Distance with them. Otherwise, it will list “X.”
    • Big_Y_STR_Difference - This lists the number of mismatches you have out of the total number of Big Y STRs you and a match both have. You can learn more about Big Y STRs here.
      NOTE: This does not include STRs for which you have results but your match does not, or vice versa. 
    • Paternal_Ancestor - This lists any information about the Most Distant Known direct paternal ancestor that your match has provided.
    • Note - This lists any private notes you have added to your match.
    • Linked_Relationship - Indicates if you have linked this match to your family tree.
  4. Download VCF - Contains a list of all variants found.
  5. Buy Raw Data - Takes you to the shopping cart to purchase your Big Y raw data in a BAM file format.
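
If you prefer to work with the matches download outside a spreadsheet, the short Python sketch below shows one way to filter matches by haplogroup. The column headers (Full_Name, Haplogroup) come from the field list above, but their exact spelling in your file is an assumption, so check the first row of your download. The sample rows, dates, and haplogroup names are invented placeholders.

```python
import csv
import io

# Invented sample standing in for a downloaded matches CSV.
# Header names follow the field list above; verify them against your own file.
SAMPLE_MATCHES = """Full_Name,Match_Date,Haplogroup,Non_Matching_Known_SNPs
Example Person One,2024-01-15,R-EXAMPLE1,SNP1 SNP2
Example Person Two,2024-02-20,R-EXAMPLE2,
Example Person Three,2024-03-05,R-EXAMPLE1,SNP3
"""


def matches_in_haplogroup(csv_file, haplogroup):
    """Return the Full_Name of every match whose Haplogroup equals `haplogroup`."""
    reader = csv.DictReader(csv_file)
    return [row["Full_Name"] for row in reader if row["Haplogroup"] == haplogroup]


names = matches_in_haplogroup(io.StringIO(SAMPLE_MATCHES), "R-EXAMPLE1")
print(names)
```

To run this against a real download, open the file with open(path, newline="") and pass the file object in place of the io.StringIO sample.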

 

Types of Data Files

CSV

A Comma Separated Value file (CSV file) is designed to be read by spreadsheet applications such as Google Sheets or Microsoft Excel. It is a common and versatile file format that makes it easy to access and sort your variants quickly.
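
CSV files can also be read by scripts. As a minimal sketch, the Python example below sorts named variants by their position on the Y chromosome. The header names (SNP Name, Position, Ancestral, Derived) are assumed from the field descriptions in this article, and the sample rows are invented placeholders, not real SNPs.

```python
import csv
import io

# Invented sample standing in for a named-variants CSV; values are placeholders.
SAMPLE_VARIANTS = """SNP Name,Position,Ancestral,Derived
EXAMPLE-A,2000300,C,T
EXAMPLE-B,15000000,G,A
EXAMPLE-C,9000000,A,G
"""


def variants_sorted_by_position(csv_file):
    """Return the CSV rows as dictionaries, ordered by numeric Position."""
    rows = list(csv.DictReader(csv_file))
    # Compare positions as integers; as text, "15000000" would sort before "9000000".
    return sorted(rows, key=lambda row: int(row["Position"]))


ordered = variants_sorted_by_position(io.StringIO(SAMPLE_VARIANTS))
for row in ordered:
    print(row["Position"], row["SNP Name"], row["Ancestral"], "->", row["Derived"])
```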

VCF

A Variant Call Format file (VCF file) is a text file that lists the variants found in your sequencing data. Important things to note about Big Y VCF files:

  • For informational purposes only, we also include variants that did not pass our analysis standards.
  • The download will be a zipped archive that contains a VCF and a BED file.
  • The BED file contains the regions that were targeted in Big Y and passed our sequencing and analysis quality control.
  • You can find more info about the VCF file format here.
  • You can find more info about the BED file format here.
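
Because the download includes variants that did not pass analysis standards, it can be useful to separate them from those that did. The Python sketch below opens a zipped archive, finds the VCF inside it, and counts records by the standard VCF FILTER column, where PASS marks a record that passed all filters. It assumes the archive member ends in .vcf and that the Big Y file uses the FILTER column in this standard way; the file names, the LowQual label, and the sample records are invented for illustration.

```python
import io
import zipfile


def count_vcf_filters(zip_source):
    """Count PASS and non-PASS records in the first .vcf member of a zip archive."""
    counts = {"pass": 0, "other": 0}
    with zipfile.ZipFile(zip_source) as archive:
        vcf_names = [n for n in archive.namelist() if n.lower().endswith(".vcf")]
        if not vcf_names:
            raise ValueError("no .vcf file found in archive")
        with archive.open(vcf_names[0]) as raw:
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                # Lines starting with '#' are VCF header lines, not variants.
                if line.startswith("#") or not line.strip():
                    continue
                fields = line.rstrip("\n").split("\t")
                # Standard VCF column order: CHROM POS ID REF ALT QUAL FILTER INFO
                if fields[6] == "PASS":
                    counts["pass"] += 1
                else:
                    counts["other"] += 1
    return counts


def build_sample_archive():
    """Build a tiny in-memory archive with invented contents for demonstration."""
    vcf_text = (
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
        "chrY\t2000300\t.\tC\tT\t50\tPASS\t.\n"
        "chrY\t9000000\t.\tA\tG\t10\tLowQual\t.\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("variants.vcf", vcf_text)
        archive.writestr("regions.bed", "chrY\t2000000\t2001000\n")
    buffer.seek(0)
    return buffer


print(count_vcf_filters(build_sample_archive()))
```

For a real download, pass the path of the zipped archive to count_vcf_filters instead of the in-memory sample.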

BED

A Browser Extensible Data file (BED file) is a text-based file format used to store the precise location of genomic regions and their associated annotations. It is a highly versatile and widely used format in bioinformatics. It allows researchers to define and visualize specific features on a chromosome.
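
To illustrate how BED regions are used, the Python sketch below checks whether a given position falls inside any region listed in a BED file. BED intervals use a 0-based start and an exclusive end, while VCF positions are 1-based, so the comparison adjusts for that. The chromosome label chrY and the sample regions are invented, and treating a position from your own files as 1-based is an assumption you should verify.

```python
# Invented sample standing in for the BED file from the VCF download.
SAMPLE_BED = """chrY\t2000000\t2001000
chrY\t5000000\t5000500
"""


def parse_bed(lines):
    """Parse BED lines into (chromosome, start, end) tuples."""
    regions = []
    for line in lines:
        # Skip blank lines and optional header or comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def is_covered(regions, chrom, position):
    """Return True if the 1-based `position` lies inside any BED region.

    BED starts are 0-based and ends are exclusive, so a 1-based position p
    is inside a region when start < p <= end.
    """
    return any(c == chrom and start < position <= end for c, start, end in regions)


regions = parse_bed(SAMPLE_BED.splitlines())
print(is_covered(regions, "chrY", 2000300))
print(is_covered(regions, "chrY", 3000000))
```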

BAM 

A Binary Alignment Map (BAM file) is a compressed, binary file format that represents the raw data of a DNA test. It contains the alignment of your sequenced DNA fragments, known as reads, to a reference genome. The BAM file includes essential information such as the location of each read on a chromosome, the sequence of the read, and a quality score for each base. It is a highly efficient format used by bioinformatics software for advanced genetic analysis.
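
As a small illustration of the binary format, the Python sketch below checks whether a file looks like a BAM file. It relies on two properties of the BAM format: the file is stored in gzip-compatible compressed blocks, and the decompressed data begins with the four bytes "BAM" followed by the byte 1. This is only a sanity check, not a parser; reading the alignments themselves is normally done with dedicated tools such as samtools. The sample data here is invented.

```python
import gzip
import io
import zlib

# The first four bytes of decompressed BAM data.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(binary_file):
    """Return True if the stream decompresses to data that starts with the BAM magic bytes."""
    try:
        with gzip.open(binary_file, "rb") as stream:
            return stream.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip-compatible, truncated, or corrupted, so not a usable BAM file.
        return False


# Invented stand-ins: a gzip stream with the BAM magic bytes, and plain text.
fake_bam = io.BytesIO(gzip.compress(BAM_MAGIC + b"\x00" * 8))
plain_text = io.BytesIO(b"this is not a BAM file")

print(looks_like_bam(fake_bam))
print(looks_like_bam(plain_text))
```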

Your Big Y data can be compiled into a BAM file. This entails an additional internal cost to convert, store, and provide this large file format. Originally, this cost was included in the price of the Big Y test, and the file was available to download when results were complete. However, only a small portion of Big Y participants request this download. Rather than continue to pass this cost along to all Big Y participants, we now charge only those who wish to download their BAM file. You can read more about BAM files here.
