# Table Level Screen

A license is required to access the Data Profiler on the DNAnexus Platform. For more information, please contact DNAnexus Sales (via <sales@dnanexus.com>).

## **A Note on Data:**

The data used in this section of the Academy documentation is available for download at: <https://synthea.mitre.org/downloads>

The citation for this synthetic dataset is:

Walonoski J, Klaus S, Granger E, Hall D, Gregorowicz A, Neyarapally G, Watson A, Eastman J. Synthea™ Novel coronavirus (COVID-19) model and synthetic data set. Intelligence-Based Medicine. 2020 Nov;1:100007. <https://doi.org/10.1016/j.ibmed.2020.100007>

## Table Level Screen

The Table-level screen appears when the user selects one particular table in the Navigator.

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXc1hsimQCYrcigIfByUr-qHUoeAOmkrc5Fl3LkvyzWvE6mu23EWMwCtu4Y-IWvNVjjZ2StI2uTDDn_A1QyGOe5oRgWQzYzvg94b_Vl_LdHENSDQ6mBluLWaqCYcPPWyl-zv3h--yw?key=vVck95n2RDdzCExalSaTUKx1" alt=""><figcaption></figcaption></figure>

Table-level Screen of a table in Data Profiler

### Table Overview

![](/files/xANlxVb8q2bFrVAzFD93)

Overview details on the header of the Table-level screen

In the header of the Table-level screen, the user can find overall statistics for the selected table, including:

* **Table size**: number of rows and columns of the table
* **Missing rate**: the rate of empty cells in the table
* **Duplicate rate**: the rate of duplication of an entire row in the table
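To make these three statistics concrete, here is a minimal sketch of how they could be computed for a small in-memory table. This is an illustration only, not the Data Profiler's implementation: the function `table_overview`, the sample rows, and the choices to treat `None` or an empty string as a missing cell and to count a row as a duplicate when an identical row appeared earlier are all assumptions.

```python
def table_overview(rows, columns):
    """Return size, missing rate, and duplicate rate for a list-of-dicts table."""
    n_rows, n_cols = len(rows), len(columns)
    total_cells = n_rows * n_cols

    # Assumption: a cell is "missing" when it is None or an empty string.
    missing = sum(1 for r in rows for c in columns if r.get(c) in (None, ""))

    # Assumption: a row is a "duplicate" when an identical row appeared earlier.
    seen, duplicates = set(), 0
    for r in rows:
        key = tuple(r.get(c) for c in columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    return {
        "rows": n_rows,
        "columns": n_cols,
        "missing_rate": missing / total_cells if total_cells else 0.0,
        "duplicate_rate": duplicates / n_rows if n_rows else 0.0,
    }


# Hypothetical sample data, loosely modelled on a patients table.
patients = [
    {"patient_id": "p1", "sex": "F", "race": "white"},
    {"patient_id": "p2", "sex": "M", "race": None},
    {"patient_id": "p1", "sex": "F", "race": "white"},
    {"patient_id": "p3", "sex": None, "race": "asian"},
]
overview = table_overview(patients, ["patient_id", "sex", "race"])
print(overview)
```

In this sample, 2 of the 12 cells are empty and 1 of the 4 rows repeats an earlier row.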

### Composition of Column Types

![](/files/STznXqwRcUlTNwkZpvJM)

Pie chart of Column types on the header of the Table-level screen

The pie chart shows the composition of column types in the table. The size of each slice is determined by the number of columns of that type. The user can also hover over the chart to see the count for each type.
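As a rough illustration of what the pie chart summarizes, the sketch below counts columns per type and converts the counts to slice shares. The `column_types` mapping and its column names are hypothetical; how the Data Profiler assigns a type to each column is not described here.

```python
from collections import Counter

# Hypothetical mapping of column name to column type for one table.
column_types = {
    "patient_id": "string",
    "birthdate": "datetime",
    "race": "string",
    "healthcare_expenses": "float",
}

# Each slice corresponds to the number of columns of one type.
type_counts = Counter(column_types.values())
total = sum(type_counts.values())
slice_shares = {t: 100.0 * n / total for t, n in type_counts.items()}

print(dict(type_counts))
print(slice_shares)
```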

## **Table-level charts**

<figure><img src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXdWvpvrRlinKO-qgbY4mTKPRrZQBOHkjmJNjWiFz0NCRdE2QXUYmfdMxjS_boLX9LZfd4AhG_0zS61j1JGWkUTBjkHcb8IxihkQ91z6g2D_qjFomS8C2B73u5tTaZeb-rjyesye?key=vVck95n2RDdzCExalSaTUKx1" alt=""><figcaption></figcaption></figure>

The Table-level screen has a Controller section that configures the visualization in the Chart area.

The main feature of the Table-level screen is the Chart area, which is configured by the Controller in the top-right corner of the screen. There are two main types of visualization: Completeness and Column Profiles.

### Completeness

Completeness is the default mode of the Table-level screen. It provides an overview of the count or rate of non-null values in a table. Completeness has two options: **One-way view** and **Two-way view**.

#### One-way View: Bar chart

![](/files/WWLWPDYJyaDCWYJDE4H3)

One-way view in Table-level screen

One-way view is a stacked bar chart that displays the percentage of missing values, non-duplicates, and duplicates for each column in the table. You can click on the Legend/Key to show or hide specific statistics on the chart. Hover over each column to view detailed statistics.
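The three segments of each bar can be sketched as follows. This is one plausible reading, not the Data Profiler's documented definition: it assumes that, among the non-null values of a column, the first occurrence of each distinct value counts as a non-duplicate and every repeat counts as a duplicate, so the three percentages add up to 100. The function name and sample rows are hypothetical.

```python
def one_way_completeness(rows, column):
    """Split one column into missing / non-duplicate / duplicate percentages."""
    n = len(rows)
    present = [r.get(column) for r in rows if r.get(column) is not None]

    missing = n - len(present)
    distinct = len(set(present))          # first occurrence of each value
    duplicates = len(present) - distinct  # every repeated occurrence

    def to_pct(k):
        return 100.0 * k / n if n else 0.0

    return {
        "missing": to_pct(missing),
        "non_duplicates": to_pct(distinct),
        "duplicates": to_pct(duplicates),
    }


# Hypothetical sample data.
rows = [
    {"race": "white"},
    {"race": None},
    {"race": "white"},
    {"race": "asian"},
]
bar = one_way_completeness(rows, "race")
print(bar)
```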

#### Two-way View: Heat map

<figure><img src="/files/fXW9QXhR711LfdDnwgiD" alt=""><figcaption></figcaption></figure>

Two-way view in Table-level screen

Two-way view is a heat map showing data completeness for all columns in the table. The y-axis lists the columns of the table, and the x-axis lists the unique values of the group-by column. Each cell shows how many entities in the table (as a count in **Raw count** mode, or as a percentage in **Percentage** mode) have non-null values in the column on the y-axis for the group-by value on the x-axis. The user can choose a different column as the grouping factor; each unique value of this **Group-by column** becomes a column in the heat map. Only categorical columns with at most 30 unique values appear as options.

<figure><img src="/files/5EfjdmrAIshne4kW6oFT" alt=""><figcaption></figcaption></figure>

The Controller of Two-way view

The numbers in the heat map can be configured in two ways:

* **Raw count** displays the exact number of values available in each column.
* **Percentage** shows the completeness statistic as a percentage. The statistic ranges from 0 to 100, where 0 means the data is completely missing and 100 means the data is fully complete.
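The two modes can be illustrated with a small sketch that builds the heat map values for an in-memory table. The function `completeness_heatmap` and the sample rows are hypothetical, and two choices are assumptions rather than documented behavior: rows with no group-by value are skipped, and the percentage is taken relative to the number of rows in each group.

```python
from collections import defaultdict

MAX_GROUP_VALUES = 30  # limit on unique values for a Group-by column


def completeness_heatmap(rows, group_by, columns, percentage=False):
    """Return {column: {group value: non-null count or percentage}}."""
    group_sizes = defaultdict(int)
    counts = defaultdict(int)  # (column, group value) -> non-null count

    for r in rows:
        g = r.get(group_by)
        if g is None:
            continue  # assumption: rows without a group value are skipped
        group_sizes[g] += 1
        for c in columns:
            if r.get(c) is not None:
                counts[(c, g)] += 1

    if len(group_sizes) > MAX_GROUP_VALUES:
        raise ValueError(f"{group_by} has more than {MAX_GROUP_VALUES} unique values")

    heatmap = {}
    for c in columns:
        heatmap[c] = {}
        for g, size in group_sizes.items():
            k = counts[(c, g)]
            # Assumption: the percentage is relative to the size of each group.
            heatmap[c][g] = 100.0 * k / size if percentage else k
    return heatmap


# Hypothetical sample data.
observations = [
    {"race": "white", "height": 170, "weight": 65},
    {"race": "white", "height": None, "weight": 70},
    {"race": "asian", "height": 160, "weight": None},
]
raw = completeness_heatmap(observations, "race", ["height", "weight"])
pct = completeness_heatmap(observations, "race", ["height", "weight"], percentage=True)
print(raw)
print(pct)
```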

**Two-way View: Heat map, cross-table analysis**

The user can also join the current table with another table using the **Join with table** option. After joining, the user can use a column from the joined table as the **Group-by column**.

**FAQs**

**Question**: Can I use the Two-way View to check how many female patients have sequencing data?

**Answer**: Yes, assuming that your question involves two metadata fields, patient\_sex (from the patient table) and sequencing\_run\_id (from the sequencing table), and that the patient and sequencing tables can be joined on patient\_id. In that case, open the patient table in the Two-way View, join it with the sequencing table, and choose patient\_sex as the Group-by column. In the sequencing.sequencing\_run\_id row, you can see the completeness rate broken down by each value of patient\_sex.
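The same question can be sketched outside the tool to show what the heat map cell represents. The sample rows and the function `sequencing_completeness_by_sex` are hypothetical, and the join semantics are an assumption: every patient is kept (as in a left join), and a patient counts as having sequencing data when at least one matching row has a non-null sequencing\_run\_id.

```python
from collections import defaultdict

# Hypothetical sample tables, join-able on patient_id.
patients = [
    {"patient_id": "p1", "patient_sex": "F"},
    {"patient_id": "p2", "patient_sex": "F"},
    {"patient_id": "p3", "patient_sex": "M"},
]
sequencing = [
    {"patient_id": "p1", "sequencing_run_id": "run-01"},
    {"patient_id": "p3", "sequencing_run_id": None},
]


def sequencing_completeness_by_sex(patients, sequencing):
    """Count patients with a non-null sequencing_run_id, grouped by patient_sex."""
    # Patients that have at least one non-null sequencing_run_id.
    has_run = {
        s["patient_id"]
        for s in sequencing
        if s.get("sequencing_run_id") is not None
    }

    totals, with_data = defaultdict(int), defaultdict(int)
    for p in patients:  # assumption: keep every patient, as in a left join
        sex = p["patient_sex"]
        totals[sex] += 1
        if p["patient_id"] in has_run:
            with_data[sex] += 1

    return {
        sex: {
            "count": with_data[sex],
            "percentage": 100.0 * with_data[sex] / totals[sex],
        }
        for sex in totals
    }


by_sex = sequencing_completeness_by_sex(patients, sequencing)
print(by_sex)
```

Here the `count` value corresponds to **Raw count** mode and the `percentage` value to **Percentage** mode for the sequencing.sequencing\_run\_id row.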

<figure><img src="/files/0gQvQODScsy9gSUam3wo" alt=""><figcaption></figcaption></figure>

The heat map options Controller during cross-table analysis. In this example, the "patients" table is joined with the "observations" table.

<figure><img src="/files/4Jaq0dpOkj30ase2lx3H" alt=""><figcaption></figcaption></figure>

Completeness heat map for a cross-table analysis. In this example, the main table is "patients" and the joined table is "observations". The heat map shows how many patients have available data (non-null values) in each field, broken down by patient race: white, black, asian, native, or other.

### Column Profiles

<figure><img src="/files/eNRKNyXzj15AKLCQEsVS" alt=""><figcaption></figcaption></figure>

Column Profiles mode shows each column as a tile. The chart type depends on the type of the column.

This screen provides detailed statistics and distribution charts for the columns in the table. For all column types, it displays the **missing rate** and the **duplication rate**.

For columns containing string data, it shows the **number of unique values** and the **value frequency**, which is represented in a distribution chart.

For columns containing float data, the screen provides information about the **variance**, **standard deviation**, and the **value range frequency**, which is displayed in a distribution chart. Additionally, a box plot is shown, illustrating the **maximum value**, **Q3 (upper quartile)**, **median**, **Q1 (lower quartile)**, and the **minimum value**.

For columns containing datetime data, the screen displays the **variance**, **standard deviation**, and **value range frequency** on a distribution chart. A box plot is also provided, showing the **maximum value**, **Q3 (upper quartile)**, **median**, **Q1 (lower quartile)**, and the **minimum value**.
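For reference, the numeric statistics listed above can be reproduced with Python's standard `statistics` module. This is a sketch under assumptions, not the Data Profiler's method: the function `numeric_profile` is hypothetical, null values are dropped before computing, the sample variance and sample standard deviation are used, and quartiles use the module's default interpolation method. The tool may use different conventions, so small differences in quartile values are possible.

```python
import statistics


def numeric_profile(values):
    """Summarize a float column: box plot points plus variance and std dev."""
    # Assumption: null values are excluded before computing statistics.
    present = sorted(v for v in values if v is not None)

    # quantiles(n=4) returns the three cut points between the four quarters.
    q1, _, q3 = statistics.quantiles(present, n=4)

    return {
        "min": present[0],
        "q1": q1,
        "median": statistics.median(present),
        "q3": q3,
        "max": present[-1],
        "variance": statistics.variance(present),  # sample variance
        "std_dev": statistics.stdev(present),      # sample standard deviation
    }


# Hypothetical sample column with one missing value.
profile = numeric_profile([1.0, 2.0, None, 3.0, 4.0, 5.0])
print(profile)
```

A datetime column could be profiled the same way after converting each value to a number, for example with `datetime.timestamp()`.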

### Resources

[Full Documentation](https://documentation.dnanexus.com/)

To create a support ticket for a technical issue:

1. Go to the Help header (in the same section as Projects and Tools) inside the platform.
2. Select "Contact Support".
3. Fill in the Subject and Message to submit a support ticket.

