If you also have access to ML JupyterLab (another solution in the AI/ML Accelerator Package), you can open Data Profiler directly in the JupyterLab environment and profile multiple datasets interactively within one workspace.
To get started, open an ML JupyterLab notebook, load the dataset, and profile it.
Profiling the Dataset
The integrated version of Data Profiler in ML JupyterLab (dxprofiler) offers four methods for loading the datasets you want to profile:
Loading the dataset by specifying the path to a local folder (in the ML JupyterLab job) that contains the .csv or .parquet files.
Loading the dataset from a record object (DNAnexus Dataset or Cohort). Here "project-xxxx:record-yyyy" stands for the ID of your Apollo Dataset (or Cohort) on the DNAnexus platform.
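Before handing either input to dxprofiler, it can help to check it in the notebook. The sketch below uses only the Python standard library and does not call dxprofiler itself. The helper names `find_profilable_files` and `split_record_ref` are hypothetical, chosen for illustration. It assumes the folder holds its .csv or .parquet files at the top level and that the record reference has the "project-xxxx:record-yyyy" shape described above.

```python
from pathlib import Path

# Hypothetical helpers for illustration; these are not part of dxprofiler.

def find_profilable_files(folder):
    """Return the .csv and .parquet files directly inside `folder`, sorted by path."""
    root = Path(folder)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in {".csv", ".parquet"}
    )

def split_record_ref(ref):
    """Split a 'project-xxxx:record-yyyy' reference into (project_id, record_id)."""
    project_id, sep, record_id = ref.partition(":")
    if not sep or not project_id.startswith("project-") or not record_id.startswith("record-"):
        raise ValueError(f"expected 'project-...:record-...', got {ref!r}")
    return project_id, record_id

# The placeholder ID from the text splits into its two parts.
print(split_record_ref("project-xxxx:record-yyyy"))  # ('project-xxxx', 'record-yyyy')
```

An empty list from `find_profilable_files` or a `ValueError` from `split_record_ref` points to a wrong path or a malformed ID before any profiling is attempted.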