CREST [Society 5.0 System Software]

Anonymous healthcare cohort with the guarantee of data privacy and utility

Research Summary

Healthcare data ecosystem and four challenges
  • 1 Consent to the processing of personal data

    Insufficient understanding of privacy notices

  • 2 Business operator selection

    No personal data is available for testing, so the quality of processing technology cannot be guaranteed

    A fair benchmark is needed

  • 3 Security

    99.98% of Americans are re-identifiable from personal attributes

    [Rocher 2019] "Estimating the success of re-identifications in incomplete datasets using generative models", Nature Communications, 2019.

  • 4 Utility

    Example of unnatural processing (age distribution)

    Data source: National Health and Nutrition Examination Survey (NHANES), CDC (Centers for Disease Control and Prevention)
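
The tension between challenges 3 and 4 can be illustrated with a minimal sketch. It is not the generative-model estimator of [Rocher 2019] and not the project's own method: it simply counts records that are unique on their attribute combination, then measures how far the age distribution moves after ages are generalized into 10-year bands. The synthetic records, the attribute set (age, sex, a hypothetical region code), the band width, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def uniqueness(records):
    """Fraction of records whose attribute combination occurs exactly once."""
    counts = Counter(records)
    return sum(1 for r in records if counts[r] == 1) / len(records)


def generalize_age(age, width=10):
    """Replace an exact age by the lower bound of its age band (assumed scheme)."""
    return (age // width) * width


def total_variation(p_values, q_values):
    """Total variation distance between two empirical distributions."""
    p, q = Counter(p_values), Counter(q_values)
    n_p, n_q = len(p_values), len(q_values)
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in set(p) | set(q))


# Synthetic toy data, NOT real cohort data: (age, sex, hypothetical region code).
rng = random.Random(0)
records = [
    (rng.randint(20, 79), rng.choice("MF"), rng.randint(0, 99))
    for _ in range(2000)
]

# Anonymize by coarsening the age attribute only.
generalized = [(generalize_age(age), sex, region) for age, sex, region in records]

risk_before = uniqueness(records)      # share of unique records, raw data
risk_after = uniqueness(generalized)   # share of unique records, after banding
distortion = total_variation(
    [age for age, _, _ in records],
    [age for age, _, _ in generalized],
)

print(f"unique before: {risk_before:.2f}, after: {risk_after:.2f}")
print(f"age distribution distortion (TV distance): {distortion:.2f}")
```

In this toy setting, banding lowers the share of unique records but moves all age mass onto the band boundaries, which is one way a processed age distribution can become unnatural.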

Achievement goal (deliverable): healthcare anonymous cohort infrastructure

Group collaboration

Research implementation system

| Data name | Number of individuals | Details | Number of records | Number of attributes | Attribute examples | Number of receipts (claims) |
|---|---|---|---|---|---|---|
| Health diagnosis data | 198,740 | Health checkup results are recorded | 964,636 | 49 | Height, weight, health distribution | - |
| Health insurance claims data | 288,568 | Diagnosed injuries and illnesses are recorded | 39,363,878 | 15 | Injury and illness classification code | 11,912,236 |
| Pharmaceutical claims data | 279,199 | Prescribed drugs are recorded | 31,465,504 | 21 | Pharmaceutical classification code | 9,000,249 |
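
As a reading aid for the dataset sizes listed above, the following sketch derives the average number of records per individual and per receipt. The counts are copied from the listing; the dictionary layout and key names are assumptions made for illustration, and the averages are simple ratios, not statistics reported by the project.

```python
# Counts copied from the dataset listing; None marks the "-" entry.
datasets = {
    "Health diagnosis data": {
        "individuals": 198_740, "records": 964_636, "receipts": None,
    },
    "Health insurance claims data": {
        "individuals": 288_568, "records": 39_363_878, "receipts": 11_912_236,
    },
    "Pharmaceutical claims data": {
        "individuals": 279_199, "records": 31_465_504, "receipts": 9_000_249,
    },
}

summary = {}
for name, d in datasets.items():
    summary[name] = {
        # Average number of records held per individual.
        "records_per_individual": d["records"] / d["individuals"],
        # Average number of records per receipt, where receipts are reported.
        "records_per_receipt": (
            d["records"] / d["receipts"] if d["receipts"] else None
        ),
    }

for name, s in summary.items():
    per_receipt = s["records_per_receipt"]
    per_receipt_text = f"{per_receipt:.2f}" if per_receipt is not None else "-"
    print(f"{name}: {s['records_per_individual']:.2f} records/individual, "
          f"{per_receipt_text} records/receipt")
```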

Analysis 2. Prediction of morbidity by machine learning

Our goals

  • Data safety: to reduce the various risks in anonymization

  • Data usefulness: to keep the anonymized data as practical as the medical data in actual use