
Extraction Quality

Required field extraction rates and confidence scores by document type
Accuracy Summary by Date & Document Type
Confidence threshold: 60%

Date        Document Type      Total Pages  Avg Extraction %  Avg Confidence %  Pass Rate (≥60%)  vs Baseline
2026-06-19  Consent            1            35.3%             32.4%             0.0%              ⇩ Below baseline
2026-06-19  Face sheet         13           29.3%             27.0%             0.0%              ⇩ Below baseline
2026-06-19  FAX-COVER PAGE     6            15.3%             20.0%             0.0%              ⇩ Below baseline
2026-06-19  ID/Insurance Card  1            46.2%             41.2%             0.0%              ⇩ Below baseline
2026-06-19  Medication List    2            3.8%              3.5%              0.0%              ⇩ Below baseline
2026-06-19  ORDER              5            24.6%             20.3%             0.0%              ⇩ Below baseline
2026-06-19  Progress Note      3            7.7%              6.9%              0.0%              ⇩ Below baseline
2026-06-19  Referral Order     6            61.7%             48.3%             0.0%              ⇩ Below baseline
2026-06-19  Unclassified       3            0.0%              0.0%              0.0%              ⇩ Below baseline
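
The summary figures appear to be plain averages over the page-level rows in the document-level detail. The sketch below reproduces the Referral Order row from its six pages. It assumes that a page "passes" when its Avg Conf value meets the 60% threshold; the report does not state this, but it is consistent with the 0.0% pass rate (four of the six pages have an Extracted % of 60 or more, so the pass rate cannot be based on extraction). The function and variable names are illustrative, not part of the dashboard.

```python
# (Extracted %, Avg Conf %) for the six Referral Order pages,
# copied from the document-level detail rows of this report.
referral_order_pages = [(57, 52), (58, 50), (60, 53), (62, 48), (63, 52), (70, 35)]

CONFIDENCE_THRESHOLD = 60  # percent, from the summary header


def summarize(pages, threshold=CONFIDENCE_THRESHOLD):
    """Return (avg extraction %, avg confidence %, pass rate %) for a list of pages."""
    n = len(pages)
    avg_extraction = sum(extracted for extracted, _ in pages) / n
    avg_confidence = sum(conf for _, conf in pages) / n
    # Assumption: a page passes when its average confidence meets the threshold.
    passed = sum(1 for _, conf in pages if conf >= threshold)
    return (
        round(avg_extraction, 1),
        round(avg_confidence, 1),
        round(100 * passed / n, 1),
    )


print(summarize(referral_order_pages))  # (61.7, 48.3, 0.0)
```

The page-level percentages shown in the detail rows are already rounded, so other document types may differ from the summary by a few tenths of a percent.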
Document-Level Detail
Consent 1 page
Document ID Pg # Received At Extracted % Avg Conf patient_sign… facility_name resident_name facility_add… resident_fir… patient_sign… patient_repr… facility_city resident_las… relationship… date_of_birth facility_state facility_pho… gender patient_uniq… facility_alert mrn
31cef347… 2 35% 32% ✓ 75% ✓ 75% ✓ 100% ✓ 100% ✓ 100% ✓ 100%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (35%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (32% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
11 field(s) with ≥50% miss rate — Fields frequently not extracted: patient_signature (100% missing), facility_address (100% missing), patient_signed_date (100% missing), patient_representative (100% missing), facility_city (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
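
One way to carry out the path check recommended above is to resolve each configured path against a page's page_entities JSON and list the paths that do not resolve. This is a minimal sketch under stated assumptions: the dotted path syntax, the sample payload, and the configured paths are all hypothetical, since the report does not show how page_entities_path values are written.

```python
import json


def resolve_path(entities, path):
    """Walk a dotted path through nested dicts. Return (found, value)."""
    node = entities
    for key in path.split("."):
        # Keys are compared exactly, so a case or spelling mismatch counts as missing.
        if not isinstance(node, dict) or key not in node:
            return False, None
        node = node[key]
    return True, node


def missing_paths(page_entities_json, configured_paths):
    """Return the configured paths that do not resolve in the JSON payload."""
    entities = json.loads(page_entities_json)
    return [p for p in configured_paths if not resolve_path(entities, p)[0]]


# Hypothetical page_entities payload and configured paths, for illustration only.
sample_page_entities = json.dumps(
    {
        "resident": {"name": "REDACTED", "date_of_birth": "REDACTED"},
        "facility": {"name": "REDACTED"},
    }
)
sample_paths = [
    "resident.name",
    "facility.name",
    "facility.address",
    "patient_signature",
]

print(missing_paths(sample_page_entities, sample_paths))
# ['facility.address', 'patient_signature']
```

A path reported as missing on every page of a document type points to a configuration mismatch; a path missing on only some pages is more likely a document or model issue.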
Face sheet 13 pages
Document ID Pg # Received At Extracted % Avg Conf diagnosis_in… vision_plan resident_name care_providers facility_name medicare_ben… part_d_polic… insurance_name payer_info insurance_po… medicaid_id contacts insurance_id insurance_id insurance_name dental_insur… facility_add… medicare_num… managed_medi… part_d_group group_name resident_fir… resident_las… part_d_plan_… plan_name dental_polic… group_number medicare_a_n… facility_city medicaid_num… insurance_po… long_term_ca… part_d_carrier facility_state medicare_b_n… insurance_po… date_of_birth group_name gender facility_zip insurance_name medicare_hic group_number long_term_ca… hmo_managed_… mrn plan_id facility_pho… facility_alert hmo_managed_… plan_name patient_uniq… va_policy room_number mco_identifier resident_num… hmo_managed_… ssn address city state zip
4ea33941… 3 3% 4% ✓ 100% ✓ 100%
bc961ab7… 2 16% 14% ✓ 86% ✓ 95% ✓ 86% ✓ 86% ✓ 86% ✓ 100% ✓ 29% ✓ 36% ✓ 59% ✓ 86%
bc961ab7… 3 16% 14% ✓ 86% ✓ 95% ✓ 86% ✓ 86% ✓ 86% ✓ 100% ✓ 29% ✓ 36% ✓ 59% ✓ 86%
4ea33941… 4 23% 21% ✓ 82% ✓ 85% ✓ 35% ✓ 35% ✓ 100% ✓ 100% ✓ 95% ✓ 100% ✓ 100% ✓ 100% ✓ 75% ✓ 75% ✓ 75% ✓ 75%
2c526417… 4 34% 35% ✓ 75% ✓ 95% ✓ 75% ✓ 75% ✓ 75% ✓ 95% ✓ 75% ✓ 75% ✓ 95% ✓ 95% ✓ 95% ✓ 100% ✓ 95% ✓ 95% ✓ 75% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 75% ✓ 95% ✓ 95% ✓ 95%
2c526417… 5 34% 35% ✓ 75% ✓ 95% ✓ 75% ✓ 75% ✓ 75% ✓ 95% ✓ 75% ✓ 75% ✓ 95% ✓ 95% ✓ 95% ✓ 100% ✓ 95% ✓ 95% ✓ 75% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 75% ✓ 95% ✓ 95% ✓ 95%
d4a3c62e… 4 34% 34% ✓ 75% ✓ 95% ✓ 75% ✓ 50% ✓ 50% ✓ 95% ✓ 75% ✓ 75% ✓ 95% ✓ 95% ✓ 95% ✓ 100% ✓ 95% ✓ 50% ✓ 95% ✓ 50% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 95%
d4a3c62e… 5 34% 34% ✓ 75% ✓ 95% ✓ 75% ✓ 50% ✓ 50% ✓ 95% ✓ 75% ✓ 75% ✓ 95% ✓ 95% ✓ 95% ✓ 100% ✓ 95% ✓ 50% ✓ 95% ✓ 50% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 95% ✓ 95%
c8c85d5a… 3 35% 31% ✓ 83% ✓ 86% ✓ 83% ✓ 83% ✓ 86% ✓ 59% ✓ 83% ✓ 83% ✓ 83% ✓ 86% ✓ 86% ✓ 100% ✓ 61% ✓ 86% ✓ 83% ✓ 85% ✓ 86% ✓ 83% ✓ 89% ✓ 86% ✓ 84% ✓ 50% ✓ 50% ✓ 50% ✓ 50%
c8c85d5a… 4 35% 31% ✓ 83% ✓ 86% ✓ 83% ✓ 83% ✓ 86% ✓ 59% ✓ 83% ✓ 83% ✓ 83% ✓ 86% ✓ 86% ✓ 100% ✓ 61% ✓ 86% ✓ 83% ✓ 85% ✓ 86% ✓ 83% ✓ 89% ✓ 86% ✓ 84% ✓ 50% ✓ 50% ✓ 50% ✓ 50%
c8c85d5a… 5 35% 31% ✓ 83% ✓ 86% ✓ 83% ✓ 83% ✓ 86% ✓ 59% ✓ 83% ✓ 83% ✓ 83% ✓ 86% ✓ 86% ✓ 100% ✓ 61% ✓ 86% ✓ 83% ✓ 85% ✓ 86% ✓ 83% ✓ 89% ✓ 86% ✓ 84% ✓ 50% ✓ 50% ✓ 50% ✓ 50%
31cef347… 3 40% 34% ✓ 75% ✓ 75% ✓ 77% ✓ 75% ✓ 78% ✓ 78% ✓ 75% ✓ 75% ✓ 75% ✓ 75% ✓ 75% ✓ 77% ✓ 75% ✓ 78% ✓ 78% ✓ 75% ✓ 78% ✓ 100% ✓ 77% ✓ 75% ✓ 75% ✓ 78% ✓ 37% ✓ 49% ✓ 36% ✓ 74% ✓ 74% ✓ 74% ✓ 74%
31cef347… 4 40% 34% ✓ 75% ✓ 75% ✓ 77% ✓ 75% ✓ 78% ✓ 78% ✓ 75% ✓ 75% ✓ 75% ✓ 75% ✓ 75% ✓ 77% ✓ 75% ✓ 78% ✓ 78% ✓ 75% ✓ 78% ✓ 100% ✓ 77% ✓ 75% ✓ 75% ✓ 78% ✓ 37% ✓ 49% ✓ 36% ✓ 74% ✓ 74% ✓ 74% ✓ 74%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (29%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (27% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
41 field(s) with ≥50% miss rate — Fields frequently not extracted: diagnosis_information (100% missing), vision_plan (100% missing), care_providers (100% missing), payer_info (100% missing), contacts (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
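
The "≥50% miss rate" figure can be read as the share of pages on which a required field has no extracted value. The sketch below shows that calculation on made-up data; the page dictionaries, the field list, and the function names are hypothetical, and the dashboard's actual definition of a miss may differ.

```python
def field_miss_rates(pages, required_fields):
    """Return {field: percent of pages on which the field was not extracted}."""
    rates = {}
    for field in required_fields:
        missing = sum(1 for page in pages if page.get(field) is None)
        rates[field] = 100 * missing / len(pages)
    return rates


def frequently_missed(rates, cutoff=50):
    """Return fields whose miss rate is at or above the cutoff, worst first."""
    flagged = [field for field, rate in rates.items() if rate >= cutoff]
    return sorted(flagged, key=lambda field: -rates[field])


# Hypothetical pages: each dict maps an extracted field to its confidence (%).
sample_pages = [
    {"resident_name": 86, "date_of_birth": 95},
    {"resident_name": 86, "date_of_birth": 95},
    {"resident_name": 75},
    {"resident_name": 75},
]
sample_required = ["resident_name", "vision_plan", "date_of_birth"]

rates = field_miss_rates(sample_pages, sample_required)
print(rates)
# {'resident_name': 0.0, 'vision_plan': 100.0, 'date_of_birth': 50.0}
print(frequently_missed(rates))
# ['vision_plan', 'date_of_birth']
```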
FAX-COVER PAGE 6 pages
Document ID Pg # Received At Extracted % Avg Conf facility_name resident_name facility_name patient_firs… facility_city patient_last… facility_add… resident_fir… facility_state resident_las… patient_midd… facility_city patient_full… facility_state date_of_birth facility_pho… gender date_of_birth gender facility_alert mrn mrn patient_uniq… patient_uniq…
4ea33941… 1 0% 0%
94149f97… 9 0% 0%
2c526417… 1 21% 26% ✓ 78% ✓ 78% ✓ 78% ✓ 100% ✓ 100% ✓ 100%
d4a3c62e… 1 21% 29% ✓ 100% ✓ 100% ✓ 100% ✓ 100% ✓ 100% ✓ 100%
bc961ab7… 4 25% 33% ✓ 86% ✓ 86% ✓ 86% ✓ 100% ✓ 100% ✓ 100% ✓ 100% ✓ 100%
c8c85d5a… 6 25% 32% ✓ 83% ✓ 83% ✓ 83% ✓ 100% ✓ 100% ✓ 100% ✓ 100% ✓ 100%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (15%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (20% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
18 field(s) with ≥50% miss rate — Fields frequently not extracted: facility_name (100% missing), facility_name (100% missing), patient_first_name (100% missing), facility_city (100% missing), patient_last_name (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
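
Several field names appear more than once in the FAX-COVER PAGE column header, and facility_name is listed twice among the missed fields. The duplication may be intentional (for example, two configured paths for the same field), but it is worth confirming. The sketch below flags repeated names using only the untruncated column names from the header above; truncated names are left out because their full spelling cannot be recovered from this report.

```python
from collections import Counter

# Untruncated column names from the FAX-COVER PAGE header, in display order.
fax_cover_columns = [
    "facility_name", "resident_name", "facility_name",
    "facility_city", "facility_state", "facility_city", "facility_state",
    "date_of_birth", "gender", "date_of_birth", "gender",
    "facility_alert", "mrn", "mrn",
]


def duplicated_fields(field_names):
    """Return field names that occur more than once, in first-seen order."""
    counts = Counter(field_names)
    return [name for name, count in counts.items() if count > 1]


print(duplicated_fields(fax_cover_columns))
# ['facility_name', 'facility_city', 'facility_state', 'date_of_birth', 'gender', 'mrn']
```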
ID/Insurance Card 1 page
Document ID Pg # Received At Extracted % Avg Conf resident_name facility_name resident_fir… facility_add… resident_las… facility_city facility_state date_of_birth facility_pho… gender mrn facility_alert patient_uniq…
c8c85d5a… 2 46% 41% ✓ 83% ✓ 86% ✓ 83% ✓ 83% ✓ 100% ✓ 100%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (46%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (41% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
7 field(s) with ≥50% miss rate — Fields frequently not extracted: facility_address (100% missing), facility_city (100% missing), facility_state (100% missing), facility_phone_number (100% missing), mrn (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
Medication List 2 pages
Document ID Pg # Received At Extracted % Avg Conf facility_name resident_name date_of_birth facility_add… resident_fir… facility_city facility_state resident_las… gender facility_pho… patient_uniq… facility_alert mrn
a2323fbf… 1 0% 0%
94149f97… 4 8% 7% ✓ 90%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (4%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (3% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
13 field(s) with ≥50% miss rate — Fields frequently not extracted: resident_name (100% missing), date_of_birth (100% missing), facility_address (100% missing), resident_first_name (100% missing), facility_city (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
ORDER 5 pages
Document ID Pg # Received At Extracted % Avg Conf both resident_name date_of_refe… facility_name suicidal primary_care… psychology_d… resident_fir… physician_si… facility_add… active_crisis referring_st… resident_las… facility_city homicidal psychiatry_m… position nurse_signat… date_of_birth ASAP testing facility_state doctors_orde… gender facility_phone doctors_orde… physician_si… facility_pho… mrn signature_of… facility_alert patient_uniq… resident_is_… telephone_or… primary_care…
d4a3c62e… 3 6% 6% ✓ 95% ✓ 100%
2c526417… 3 14% 14% ✓ 95% ✓ 100% ✓ 100% ✓ 100% ✓ 100%
4ea33941… 5 34% 29% ✓ 82% ✓ 85% ✓ 85% ✓ 55% ✓ 90% ✓ 83% ✓ 90% ✓ 100% ✓ 100% ✓ 84% ✓ 55%
94149f97… 1 34% 31% ✓ 90% ✓ 92% ✓ 90% ✓ 90% ✓ 88% ✓ 90% ✓ 90% ✓ 90% ✓ 100% ✓ 90% ✓ 92% ✓ 92%
31cef347… 5 34% 22% ✓ 75% ✓ 27% ✓ 75% ✓ 17% ✓ 100% ✓ 28% ✓ 100% ✓ 100% ✓ 100% ✓ 45% ✓ 17%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (25%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (20% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
27 field(s) with ≥50% miss rate — Fields frequently not extracted: both (100% missing), suicidal (100% missing), active_crisis (100% missing), referring_staff_name (100% missing), homicidal (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
Progress Note 3 pages
Document ID Pg # Received At Extracted % Avg Conf resident_name facility_name resident_fir… facility_city resident_las… facility_state date_of_birth facility_add… gender facility_pho… patient_uniq… facility_alert mrn
94149f97… 2 8% 7% ✓ 90%
94149f97… 6 8% 7% ✓ 90%
94149f97… 7 8% 7% ✓ 90%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (8%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (7% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
12 field(s) with ≥50% miss rate — Fields frequently not extracted: resident_name (100% missing), resident_first_name (100% missing), facility_city (100% missing), resident_last_name (100% missing), facility_state (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
Referral Order 6 pages
Document ID Pg # Received At Extracted % Avg Conf financial_re… resident_name face_sheet primary_care… depressive_d… facility_name both suicidal facility_phone hallucinations self_respons… resident_fir… physician_si… consent psychiatry_m… homicidal active_crisis date_of_refe… referral_form physician_si… memory_loss psychology_d… responsible_… resident_las… ASAP referring_st… testing doctors_orde… effective_en… date_of_birth doctors_orders depression_s… legal_guardian position urgent gender delusions doctors_orde… elopement signature_of… legal_guardi… new primary_lang… ssn legal_guardi… withdrawal nurse_signat… telephone_or… disorganized… npi obtained room_number wandering name_of_pers… relation_to_… isolation hospice_pati… contact_numb… assisted_liv… grief_loss_i… short_term_m… medicaid_pen… date_and_tim… medicaid_app… additional_i… tearfulness facility_sta… medicare_par… bereavement long_term_me… resident_is_… staff_position staff_signat… anxiety va_service_c… primary_care… signature_date anxiety_diso… primary_insu… family_relat… alcohol_subs… secondary_in… worrying interpersona… feeding_eati… agitation issues_of_de… sleep_distur… irritability issues_of_te… weight_loss anger issues_with_… refusal_low_… psychosis adjustment_d… paranoia confusion medication_r… behavioral_c… previous_men… other_reason… cognitive_te… psychologica… psychologica… psychologica… psychotic_bi… neurocogniti… trauma_stres… resistance_t… substance_re… appetite_dis… other_behavi… non_compliance high_risk_be… verbal_aggre… physical_agg… sexually_ina… obsessive_co… attention_se… medical_orga… rehabilitati… psychologica…
4ea33941… 2 57% 52% ✓ 82% ✓ 85% ✓ 82% ✓ 85% ✓ 82% ✓ 67% ✓ 100% ✓ 100% ✓ 75% ✓ 100% ✓ 83% ✓ 67% ✓ 61% ✓ 82% ✓ 85%
bc961ab7… 1 58% 50% ✓ 86% ✓ 95% ✓ 82% ✓ 82% ✓ 82% ✓ 82% ✓ 82% ✓ 100% ✓ 85% ✓ 100% ✓ 79% ✓ 100% ✓ 87% ✓ 100% ✓ 61%
c8c85d5a… 1 60% 53% ✓ 83% ✓ 86% ✓ 81% ✓ 81% ✓ 81% ✓ 81% ✓ 79% ✓ 100% ✓ 82% ✓ 100% ✓ 100% ✓ 100% ✓ 74% ✓ 80% ✓ 81% ✓ 74% ✓ 74% ✓ 82%
d4a3c62e… 2 62% 48% ✓ 74% ✓ 72% ✓ 95% ✓ 75% ✓ 74% ✓ 74% ✓ 77% ✓ 74% ✓ 74% ✓ 100% ✓ 77% ✓ 69% ✓ 100% ✓ 100% ✓ 68% ✓ 68% ✓ 71% ✓ 77% ✓ 74% ✓ 72% ✓ 77%
2c526417… 2 63% 52% ✓ 78% ✓ 74% ✓ 95% ✓ 79% ✓ 78% ✓ 77% ✓ 78% ✓ 76% ✓ 100% ✓ 81% ✓ 74% ✓ 81% ✓ 74% ✓ 74% ✓ 73% ✓ 74% ✓ 81% ✓ 80% ✓ 74% ✓ 81% ✓ 75%
31cef347… 1 70% 35% ✓ 75% ✓ 75% ✓ 50% ✓ 50% ✓ 50% ✓ 50% ✓ 100% ✓ 50% ✓ 50% ✓ 50% ✓ 100% ✓ 50% ✓ 50% ✓ 50% ✓ 100% ✓ 50%
Root Cause Analysis & Recommendations
Warning: Extraction rate below baseline (62% vs 80% target) — Some required fields are consistently missing across pages.
Identify the specific fields with low extraction rates and verify their JSON paths are correct. Consider reprocessing documents with missing fields.
Low average confidence (48% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
51 field(s) with ≥50% miss rate — Fields frequently not extracted: financial_responsibility (100% missing), physician_signature (100% missing), ASAP (100% missing), urgent (100% missing), legal_guardian_relationship (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.
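
The findings in this report use two severity levels for extraction rate: "Critical" where more than half of required fields are missing, and "Warning" where the rate is below the 80% target. The sketch below expresses that rule with cutoffs inferred from the message text (50% and 80%); the exact thresholds and the function name are assumptions, not confirmed dashboard settings.

```python
def extraction_severity(rate, critical_below=50.0, target=80.0):
    """Classify an average extraction rate (percent) into a severity level."""
    if rate < critical_below:
        return "critical"  # more than half of required fields are missing
    if rate < target:
        return "warning"   # below the target, but at least half extracted
    return "ok"


# Average extraction rates taken from the accuracy summary in this report.
summary_rates = {
    "ID/Insurance Card": 46.2,
    "Referral Order": 61.7,
    "Unclassified": 0.0,
}

for doc_type, rate in summary_rates.items():
    print(doc_type, extraction_severity(rate))
# ID/Insurance Card critical
# Referral Order warning
# Unclassified critical
```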
Unclassified 3 pages
Document ID Pg # Received At Extracted % Avg Conf resident_name patient_firs… facility_name facility_city date_of_birth patient_last… patient_midd… facility_state patient_full… date_of_birth gender
94149f97… 3 0% 0%
94149f97… 5 0% 0%
94149f97… 8 0% 0%
Root Cause Analysis & Recommendations
Critical: Low field extraction rate (0%) — More than half of required fields are not being extracted on average.
Review the page_entities_path configuration in the pageentities table. Verify the JSON structure in page_entities matches the configured paths for this document type.
Low average confidence (0% vs 60% baseline) — The AI model is returning low-confidence extractions for this document type.
Consider providing additional training samples for this document type. Check if document image quality is poor (low DPI, skewed scans).
11 field(s) with ≥50% miss rate — Fields frequently not extracted: resident_name (100% missing), patient_first_name (100% missing), facility_name (100% missing), facility_city (100% missing), date_of_birth (100% missing)
Check that these field names and JSON paths in pageentities exactly match the keys in the page_entities JSON column. Use the document search to inspect specific pages.