Microsoft announces the public availability of two data science utilities


Data scientists spend a significant amount of time writing code to answer the following questions:

  • What does the data look like? What’s the schema?
  • What’s the quality of the data? What’s the severity of missing data?
  • How are individual variables distributed? Do I need to do variable transformation?
  • How relevant is the data to the machine learning task? How difficult is the machine learning task itself?
  • Which variables are most relevant to the machine learning target?
  • Is there any specific clustering pattern in the data?
  • How will ML models on the data perform? Which variables are significant in the models?
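To illustrate the kind of reusable code these questions lead to, here is a minimal sketch of a helper that addresses the first three: schema, missing data, and the distribution of numeric variables. It is written in Python with only the standard library. The function name `profile`, the list-of-dicts input format, and the sample records are hypothetical and chosen only for this example.

```python
from collections import Counter
from statistics import mean, median


def profile(rows):
    """Summarize a list of dict records, one summary per column.

    Hypothetical helper for illustration only.
    """
    # Collect every column name that appears in any record.
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None (or an absent key) as a missing value.
        present = [v for v in values if v is not None]
        summary = {
            # Schema: which Python types occur in this column.
            "types": dict(Counter(type(v).__name__ for v in present)),
            # Quality: share of records with no value for this column.
            "missing_fraction": 1 - len(present) / len(rows) if rows else 0.0,
        }
        # Distribution: basic statistics for numeric columns only.
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        if numeric:
            summary["min"] = min(numeric)
            summary["max"] = max(numeric)
            summary["mean"] = mean(numeric)
            summary["median"] = median(numeric)
        report[col] = summary
    return report


# Made-up sample records, used only to exercise the helper.
sample = [
    {"age": 34, "city": "Oslo"},
    {"age": None, "city": "Lima"},
    {"age": 50, "city": None},
    {"age": 42, "city": "Oslo"},
]
report = profile(sample)
for column, summary in report.items():
    print(column, summary)
```

Because the helper takes any list of records and returns a plain dictionary, the same function can be reused across projects instead of being rewritten for each dataset.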

Much of this code can be generalized into data science utilities that are reusable across projects. Such utilities guide data scientists through specific tasks in a project and help ensure those tasks are carried out consistently and completely. To help data scientists, Microsoft is releasing two such utilities:

  1. Interactive Data Exploration, Analysis and Reporting (IDEAR), and
  2. Automated Modeling and Reporting (AMAR).

These two utilities, which run in CRAN-R, can be accessed from this GitHub site.

Read more about these utilities here.
