Accessing Data Profiler in ML JupyterLab
If you also have access to ML JupyterLab (another solution in the AI/ML Accelerator Package), you can open Data Profiler directly in the JupyterLab environment and profile multiple datasets interactively within one workspace.
To get started, open an ML JupyterLab notebook, load the dataset, and profile it.
Profiling the Dataset
The integrated version of Data Profiler in ML JupyterLab (dxprofiler) offers four ways to load the datasets you want to profile:
Load the dataset from a path to a local folder (in the ML JupyterLab job) that contains the .csv or .parquet files.
Load the dataset from a list of .csv or .parquet files.
Load the dataset from pandas DataFrames (for example, 'patient_df' and 'clinical_df').
Load the dataset from a record object (DNAnexus Dataset or Cohort), where "project-xxxx:record-yyyy" is the ID of your Apollo Dataset (or Cohort) on the DNAnexus platform.
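The inputs for these loading methods can be prepared with plain Python before anything is handed to dxprofiler. The sketch below is only an illustration under stated assumptions: list_table_files and split_record_id are hypothetical helper names, not part of dxprofiler, and the dxprofiler loading calls themselves are left out because their signatures are not given in this section. It uses only the Python standard library.

```python
from pathlib import Path

# File extensions this section says Data Profiler can load.
TABLE_SUFFIXES = {".csv", ".parquet"}


def list_table_files(folder):
    """Hypothetical helper: return sorted .csv/.parquet paths directly inside folder."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    return sorted(
        str(p)
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in TABLE_SUFFIXES
    )


def split_record_id(record_id):
    """Hypothetical helper: split 'project-xxxx:record-yyyy' into (project, record)."""
    project, sep, record = record_id.partition(":")
    if not sep or not project.startswith("project-") or not record.startswith("record-"):
        raise ValueError(f"expected 'project-...:record-...', got {record_id!r}")
    return project, record
```

Calling list_table_files on the local folder gives an explicit file list for the second loading method, and split_record_id offers a quick check that a record ID has the "project-xxxx:record-yyyy" shape before it is used for the fourth.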
Open the Data Profiler GUI
Once you finish profiling the dataset, here is the command to open the Data Profiler GUI:
Resources
If you run into technical issues, create a support ticket:
Go to Help in the platform header (the same section where Projects and Tools are).
Select "Contact Support".
Fill in the Subject and Message fields, then submit the support ticket.