Introduction
Electronic Health Record systems store and exchange patient health data across providers, payers, and patients. Design must prioritize data correctness, HIPAA compliance, fine-grained access control, and interoperability. Mistakes in an EHR can directly harm patients, so the system favors consistency and auditability over availability — it is acceptable to reject a write rather than risk a corrupt or unauthorized record.
Patient Record Schema
Patient (patient_id, mrn, first_name, last_name, dob, gender, ssn_hash, address, insurance_id) — the root entity. MRN (Medical Record Number) is the facility-scoped identifier; patient_id is the system primary key. SSN is stored as a hash, not plaintext.
Encounter (encounter_id, patient_id, provider_id, facility_id, encounter_type, start_time, end_time, chief_complaint, discharge_summary) — each clinical visit or admission. Encounter_type covers outpatient, inpatient, emergency, telehealth.
Diagnosis (diagnosis_id, encounter_id, icd10_code, description, is_primary, onset_date): ICD-10 coded diagnoses associated with an encounter. is_primary distinguishes the principal diagnosis for the encounter from secondary conditions.
Medication (medication_id, patient_id, rxnorm_code, drug_name, dosage, frequency, start_date, end_date, prescriber_id) — active and historical medications. RxNorm coding enables cross-system drug interaction checking.
LabResult (result_id, patient_id, encounter_id, loinc_code, test_name, value, unit, reference_range, status, collected_at) — lab results in LOINC-coded format. Status tracks ordered/resulted/corrected/cancelled lifecycle.
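The following is a minimal sketch of two of these tables, using Python's built-in sqlite3 module purely for illustration. The column types, the CHECK constraint, and the create_schema helper are assumptions and not part of the design above; a production EHR would likely use a different database engine and stricter constraints.

```python
import sqlite3

# Illustrative DDL only: column types and constraints are assumptions.
SCHEMA = """
CREATE TABLE patient (
    patient_id   INTEGER PRIMARY KEY,
    mrn          TEXT NOT NULL,
    first_name   TEXT NOT NULL,
    last_name    TEXT NOT NULL,
    dob          TEXT NOT NULL,   -- ISO 8601 date, e.g. 1980-01-31
    gender       TEXT,
    ssn_hash     TEXT,            -- a hash, never the plaintext SSN
    address      TEXT,
    insurance_id TEXT
);
CREATE TABLE encounter (
    encounter_id      INTEGER PRIMARY KEY,
    patient_id        INTEGER NOT NULL REFERENCES patient(patient_id),
    provider_id       INTEGER NOT NULL,
    facility_id       INTEGER NOT NULL,
    encounter_type    TEXT NOT NULL CHECK (
        encounter_type IN ('outpatient', 'inpatient', 'emergency', 'telehealth')),
    start_time        TEXT NOT NULL,
    end_time          TEXT,
    chief_complaint   TEXT,
    discharge_summary TEXT
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the illustrative tables on an open connection."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
```

The CHECK constraint reflects the preference stated in the introduction: an encounter with an unknown encounter_type is rejected at write time rather than stored.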
FHIR Interoperability
HL7 FHIR (Fast Healthcare Interoperability Resources) is the standard RESTful API and JSON/XML resource model for clinical data exchange. FHIR defines resource types that map closely to EHR entities: Patient, Encounter, Condition (for diagnoses), Observation (for lab results and vitals), MedicationRequest, and DiagnosticReport. The EHR exposes a FHIR API layer that translates between internal schema tables and FHIR-compliant JSON responses. This enables data exchange with external EHR systems, payers, and patient-facing apps such as Apple Health. The FHIR Bulk Data $export operation supports data sharing with payers for value-based care programs.
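As an illustration of the translation layer, the sketch below maps an internal Patient row to a minimal FHIR R4 Patient resource. The function name patient_row_to_fhir and the identifier system URI are hypothetical, and the row is assumed to be a plain dict keyed by the schema column names. Only a handful of Patient elements are populated.

```python
def patient_row_to_fhir(row: dict) -> dict:
    """Map an internal Patient row to a minimal FHIR R4 Patient resource."""
    # FHIR administrative gender is limited to: male | female | other | unknown.
    gender = row.get("gender")
    if gender not in {"male", "female", "other", "unknown"}:
        gender = "unknown"
    return {
        "resourceType": "Patient",
        "id": str(row["patient_id"]),
        "identifier": [{
            # Hypothetical naming-system URI for this facility's MRNs.
            "system": "https://ehr.example.org/identifiers/mrn",
            "value": row["mrn"],
        }],
        "name": [{"family": row["last_name"], "given": [row["first_name"]]}],
        "gender": gender,
        "birthDate": row["dob"],  # expected as YYYY-MM-DD
    }
```

Note that ssn_hash and insurance_id are deliberately left out of the response in this sketch; whether and how such fields are exposed is a policy decision outside its scope.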
HIPAA Access Controls
Access is role-based with fine-grained resource scoping. An attending physician sees the full patient record. A nurse sees clinical notes and orders. A billing staff member sees diagnoses and insurance data but not clinical notes. A patient sees their own record via the patient portal. Every data access, read or write, records a purpose-of-use log entry: (user_id, patient_id, resource_type, purpose, timestamp). Break-glass access is supported for emergencies: a clinician can override access restrictions with a documented reason, and the access is flagged for post-hoc audit review. The insurance_id field and any stored SSN-derived value are encrypted at rest using envelope encryption: the data is encrypted with a data encryption key (DEK), and the DEK is itself encrypted by a key encryption key held in a KMS, so only the wrapped DEK is stored alongside the data.
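A minimal sketch of the role check and the purpose-of-use log might look like the following. The role names, the resource type strings, the in-memory list standing in for the log store, and the extra "allowed" field are all assumptions made for illustration. The sketch also does not model the treatment relationship between a clinician and a specific patient, or break-glass overrides.

```python
from datetime import datetime, timezone

# Assumed role-to-resource mapping, loosely following the roles described above.
ROLE_SCOPES = {
    "attending_physician": {"clinical_note", "order", "diagnosis",
                            "medication", "lab_result", "insurance"},
    "nurse": {"clinical_note", "order"},
    "billing": {"diagnosis", "insurance"},
}

purpose_of_use_log = []  # stand-in for a durable log store


def is_allowed(user: dict, patient_id: str, resource_type: str) -> bool:
    """Return True if the user's role permits this resource type."""
    if user["role"] == "patient":
        # Patients may only see their own record.
        return user.get("patient_id") == patient_id
    return resource_type in ROLE_SCOPES.get(user["role"], set())


def request_access(user: dict, patient_id: str,
                   resource_type: str, purpose: str) -> bool:
    """Check access and record a purpose-of-use entry either way."""
    allowed = is_allowed(user, patient_id, resource_type)
    purpose_of_use_log.append({
        "user_id": user["user_id"],
        "patient_id": patient_id,
        "resource_type": resource_type,
        "purpose": purpose,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "allowed": allowed,  # assumption: denied attempts are logged too
    })
    return allowed
```

Unknown roles fall through to an empty scope set, so the default is to deny.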
Audit Log
An append-only AccessLog table records every data interaction: (access_id, user_id, patient_id, resource_type, resource_id, action ENUM read/write/delete, timestamp, ip_address, session_id). HIPAA's six-year documentation retention requirement is generally applied to audit logs, so they are retained for at least 6 years. The audit log is immutable: no updates and no deletes are permitted. It is stored in a separate database where application service accounts can only insert rows; only the compliance reporting service has read access. Compliance reports (who accessed a patient’s record over a date range) are generated from the audit log on demand and delivered to privacy officers.
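The sketch below shows one way to express the append-only rule and the on-demand compliance report, using SQLite triggers through Python's sqlite3 module. It is an illustration under assumptions: the table and trigger names, the helper functions, and the choice of SQLite are not from the design above, and a production system would normally rely on database grants and separate service accounts rather than triggers alone.

```python
import sqlite3

AUDIT_DDL = """
CREATE TABLE access_log (
    access_id     INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id       TEXT NOT NULL,
    patient_id    TEXT NOT NULL,
    resource_type TEXT NOT NULL,
    resource_id   TEXT NOT NULL,
    action        TEXT NOT NULL CHECK (action IN ('read', 'write', 'delete')),
    timestamp     TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    ip_address    TEXT,
    session_id    TEXT
);
CREATE TRIGGER access_log_no_update BEFORE UPDATE ON access_log
BEGIN SELECT RAISE(ABORT, 'access_log is append-only'); END;
CREATE TRIGGER access_log_no_delete BEFORE DELETE ON access_log
BEGIN SELECT RAISE(ABORT, 'access_log is append-only'); END;
"""


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(AUDIT_DDL)
    return conn


def record_access(conn, user_id, patient_id, resource_type, resource_id,
                  action, ip_address=None, session_id=None) -> None:
    """Append one audit row; the timestamp is assigned by the database."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO access_log (user_id, patient_id, resource_type,"
            " resource_id, action, ip_address, session_id)"
            " VALUES (?, ?, ?, ?, ?, ?, ?)",
            (user_id, patient_id, resource_type, resource_id,
             action, ip_address, session_id),
        )


def access_report(conn, patient_id, start, end):
    """Who accessed a patient's record in [start, end), oldest first."""
    cur = conn.execute(
        "SELECT user_id, resource_type, resource_id, action, timestamp"
        " FROM access_log"
        " WHERE patient_id = ? AND timestamp >= ? AND timestamp < ?"
        " ORDER BY timestamp, access_id",
        (patient_id, start, end),
    )
    return cur.fetchall()
```

The date range is compared as ISO 8601 strings, which sort chronologically as long as every timestamp uses the same UTC format.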
Clinical Notes
Structured clinical notes follow the SOAP format: Subjective (patient-reported symptoms), Objective (exam findings, vitals), Assessment (clinical judgment), Plan (treatment plan). Notes are stored as text attached to an encounter. An NLP pipeline runs asynchronously after note submission, extracting structured entities — ICD-10 diagnosis codes, RxNorm medication references, LOINC lab references — for indexing and decision support. Notes are append-only versioned documents: a clinician can add an addendum but cannot overwrite the original note. Each note is e-signed with the provider’s digital certificate, creating a tamper-evident record.
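To make the append-only behaviour concrete, here is a small sketch of a SOAP note with addenda. The class names and the use of a SHA-256 content hash are assumptions for illustration; the hash only shows what an addendum could be bound to and is not a substitute for the certificate-based e-signature described above.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class SoapNote:
    encounter_id: str
    author_id: str
    subjective: str
    objective: str
    assessment: str
    plan: str
    created_at: str

    def content_hash(self) -> str:
        """SHA-256 over all fields, joined with a separator byte."""
        parts = [self.encounter_id, self.author_id, self.subjective,
                 self.objective, self.assessment, self.plan, self.created_at]
        return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Addendum:
    note_hash: str   # hash of the original note this addendum refers to
    author_id: str
    text: str
    created_at: str


@dataclass
class NoteRecord:
    note: SoapNote
    addenda: list = field(default_factory=list)

    def add_addendum(self, author_id: str, text: str) -> Addendum:
        """Append an addendum; the original note is never modified."""
        addendum = Addendum(
            note_hash=self.note.content_hash(),
            author_id=author_id,
            text=text,
            created_at=datetime.now(timezone.utc).isoformat(),
        )
        self.addenda.append(addendum)
        return addendum
```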
Data Retention and De-identification
Patient records are retained for a minimum of 10 years after the last encounter; state laws vary, and some jurisdictions require longer periods. For research use, records are de-identified before sharing. HIPAA provides two de-identification methods: Safe Harbor (remove all 18 specified identifier categories, including names, geographic units smaller than a state, date elements other than year, and device identifiers) or Expert Determination (a qualified expert certifies that the re-identification risk is very small). De-identified datasets are stored in a separate environment and shared with researchers under signed Data Use Agreements. The agreements prohibit re-identification regardless of technical feasibility, and any data that is re-identified becomes protected health information subject to HIPAA again.
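A partial sketch of a Safe Harbor style transform over a flat patient record is shown below. It is not a complete implementation of the rule: the field names (including the hypothetical zip, phone, and email keys) are assumptions, free-text fields are not scanned, and the provision that low-population three-digit ZIP areas must be replaced with 000 is not handled because it needs census data.

```python
# Fields dropped outright. Names are assumptions based on the schema above,
# plus hypothetical contact fields.
DROP_FIELDS = {"patient_id", "mrn", "first_name", "last_name", "ssn_hash",
               "address", "insurance_id", "phone", "email"}


def deidentify_safe_harbor_sketch(record: dict, as_of_year: int) -> dict:
    """Return a copy of record with direct identifiers removed or generalized."""
    out = {k: v for k, v in record.items() if k not in DROP_FIELDS}

    dob = out.pop("dob", None)
    if dob:
        birth_year = int(str(dob)[:4])
        # Conservative: anyone who turns 90 during as_of_year is treated
        # as 90+, and their birth year is withheld.
        if as_of_year - birth_year >= 90:
            out["age_group"] = "90+"
        else:
            out["birth_year"] = birth_year  # year only; month and day removed

    zip_code = out.pop("zip", None)
    if zip_code:
        # Keep the first three digits only (low-population areas not handled).
        out["zip3"] = str(zip_code)[:3]
    return out
```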
Frequently Asked Questions: Electronic Health Records System
How do you design a FHIR API for an Electronic Health Records system and which resource types matter most?
Model your API around the core FHIR R4 resource types: Patient (demographics and identifiers), Encounter (visit or hospitalization), Observation (vitals, lab results), Condition (diagnoses), MedicationRequest (prescriptions), AllergyIntolerance, and DocumentReference (clinical notes). Expose RESTful endpoints following the FHIR specification: GET /Patient/{id}, GET /Observation?patient={id}&category=laboratory. Use FHIR search parameters with _include and _revinclude to reduce round trips. Return OperationOutcome resources for errors. Publish a CapabilityStatement at GET /metadata so clients can discover the FHIR version and the interactions the server supports.
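The two helpers below sketch a search request and an error response. The base URL and function names are hypothetical; the search parameters and the OperationOutcome shape follow FHIR R4.

```python
from urllib.parse import urlencode

BASE_URL = "https://ehr.example.org/fhir"  # hypothetical server base


def lab_observation_search_url(patient_id: str) -> str:
    """Build a search for a patient's lab Observations, including the Patient."""
    params = [
        ("patient", patient_id),
        ("category", "laboratory"),
        ("_include", "Observation:patient"),  # return the Patient in the same Bundle
    ]
    return f"{BASE_URL}/Observation?{urlencode(params)}"


def operation_outcome(severity: str, code: str, diagnostics: str) -> dict:
    """Build a minimal OperationOutcome, e.g. severity='error', code='not-found'."""
    return {
        "resourceType": "OperationOutcome",
        "issue": [{
            "severity": severity,   # fatal | error | warning | information
            "code": code,           # a FHIR issue-type code
            "diagnostics": diagnostics,
        }],
    }
```

urlencode percent-encodes the colon in the _include value, so the query string ends in _include=Observation%3Apatient.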
What is the HIPAA break-glass access pattern and how do you implement it in an EHR system?
Break-glass allows a clinician to override normal access controls in an emergency (e.g., an unconscious patient whose records are restricted). Implement it by adding a break-glass flag to the authorization request. When asserted, the system bypasses relationship-based access checks and grants read access for a short session window (typically 4-8 hours). Every break-glass access must be logged to an immutable audit table with the clinician ID, patient ID, timestamp, and stated reason. Automated alerts notify the privacy officer and the patient after the fact. Regular audits review break-glass frequency per clinician to detect misuse.
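A minimal sketch of the grant and its expiry check follows. The four-hour window is an assumed value at the low end of the range mentioned above, the in-memory audit_events list stands in for the immutable audit table, and privacy-officer alerting is left out.

```python
from datetime import datetime, timedelta, timezone

BREAK_GLASS_WINDOW = timedelta(hours=4)  # assumed session window

audit_events = []  # stand-in for the immutable audit table


def open_break_glass(clinician_id: str, patient_id: str, reason: str,
                     now: datetime = None) -> dict:
    """Grant time-limited emergency read access and log it for review."""
    if not reason or not reason.strip():
        raise ValueError("break-glass requires a documented reason")
    now = now or datetime.now(timezone.utc)
    grant = {
        "clinician_id": clinician_id,
        "patient_id": patient_id,
        "reason": reason.strip(),
        "granted_at": now,
        "expires_at": now + BREAK_GLASS_WINDOW,
    }
    # Every grant is recorded and flagged for post-hoc review.
    audit_events.append({**grant, "event": "break_glass_opened",
                         "flagged_for_review": True})
    return grant


def grant_is_active(grant: dict, patient_id: str, now: datetime = None) -> bool:
    """A grant covers exactly one patient and only until it expires."""
    now = now or datetime.now(timezone.utc)
    return grant["patient_id"] == patient_id and now < grant["expires_at"]
```

Scoping the grant to a single patient keeps an emergency override from turning into general access for the rest of the shift.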
How do you enforce immutability of audit logs in an Electronic Health Records system?
Write audit events to an append-only table with no UPDATE or DELETE grants on the audit role. Use a database sequence or UUID with a server-side timestamp so clients cannot supply their own IDs or times. Add a hash chain: each row stores the SHA-256 of its own fields concatenated with the previous row’s hash, making tampering detectable. Ship logs in near-real-time to an immutable object store (AWS S3 Object Lock in COMPLIANCE mode, or similar) so even a compromised database cannot erase them. Run a nightly verification job that recomputes the hash chain and alerts on any break.
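The hash chain and the verification job can be sketched as follows. The in-memory list, the all-zero genesis value, and the canonical JSON encoding of the row fields are assumptions; any stable serialization would serve the same purpose.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first row's predecessor


def row_hash(fields: dict, prev_hash: str) -> str:
    """SHA-256 of the row's fields (canonical JSON) plus the previous hash."""
    payload = json.dumps(fields, sort_keys=True, separators=(",", ":")) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(chain: list, fields: dict) -> dict:
    """Append a new audit row linked to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS_HASH
    row = {"fields": dict(fields), "prev_hash": prev,
           "hash": row_hash(fields, prev)}
    chain.append(row)
    return row


def first_broken_index(chain: list):
    """Recompute the chain; return the index of the first bad row, or None."""
    prev = GENESIS_HASH
    for i, row in enumerate(chain):
        if row["prev_hash"] != prev or row["hash"] != row_hash(row["fields"], prev):
            return i
        prev = row["hash"]
    return None
```

Editing any stored field changes that row's recomputed hash, and deleting a row breaks the prev_hash link of its successor, so both kinds of tampering surface in the verification pass. An attacker who can rewrite every subsequent hash could still forge a consistent chain, which is why the copy in the immutable object store matters.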
What is the difference between HIPAA Safe Harbor and Expert Determination de-identification methods?
Safe Harbor requires removing 18 specific identifier categories enumerated in the HIPAA Privacy Rule, including names, geographic subdivisions smaller than a state, all date elements other than year, ages over 89 (aggregated into a single 90-or-older category), phone and fax numbers, email addresses, SSNs, MRNs, health plan and account numbers, certificate and license numbers, vehicle and device identifiers, URLs, IP addresses, biometric identifiers, full-face photographs, and any other unique identifying number, characteristic, or code. It is straightforward but conservative and loses analytical value. Expert Determination requires a person with appropriate statistical and scientific expertise to determine, and document, that the re-identification risk is very small, allowing retention of more granular data. Expert Determination is preferred for research datasets where Safe Harbor would destroy utility; Safe Harbor is used for operational data sharing where simplicity and defensibility matter.
How should clinical notes versioning and e-signature be designed in an EHR system?
Store each version of a clinical note as an immutable row with a version sequence number, author ID, timestamp, and status (DRAFT, SIGNED, AMENDED, ADDENDUM). Never overwrite a signed note; instead create a new version with status AMENDED that references the prior version ID. An e-signature record captures the signing clinician’s credential, a timestamp, and an RSA or ECDSA signature over the note content hash. The signature is verified on read to detect tampering. Addenda are separate signed documents linked to the original note ID. A view layer assembles the current authoritative version by following the version chain to the latest non-superseded entry.
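The view-layer step can be sketched as below. The NoteVersion class, the supersedes field name, and the decision to ignore DRAFT rows when picking the authoritative version are assumptions; signature verification is omitted because it depends on the chosen cryptographic library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NoteVersion:
    version_id: int
    note_id: str
    status: str                       # DRAFT, SIGNED, AMENDED, or ADDENDUM
    content: str
    supersedes: Optional[int] = None  # prior version_id, if this row amends one


def current_version(versions, note_id: str) -> Optional[NoteVersion]:
    """Return the latest signed or amended version that nothing supersedes."""
    # Addenda are separate linked documents and drafts are not authoritative,
    # so neither is a candidate here (an assumption of this sketch).
    candidates = [v for v in versions
                  if v.note_id == note_id and v.status in {"SIGNED", "AMENDED"}]
    superseded = {v.supersedes for v in candidates if v.supersedes is not None}
    live = [v for v in candidates if v.version_id not in superseded]
    if not live:
        return None
    return max(live, key=lambda v: v.version_id)
```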