KX–Salto Integration: Detailed Design and Validation
- 1. Context & Motivation
- 2. High-Level Architecture
- 3. Core Design Principles
- 4. Data Model and Key Tables
- 5. Business Rules in Detail
- 6. Building the Proposed Assignment Views
- 7. Sync Mechanics and Operational Flow
- 8. Testing & Validation Framework
- 9. Test Scenarios and Simulation Harness
- 10. Performance and Indexing
- 11. Handling Real-World Data Problems
- 12. Visualisation and Stakeholder Communication
- 13. Operational Playbook
1. Context & Motivation
At St John’s College Cambridge, three independent systems have to cooperate to answer a deceptively simple question: "Which student should be able to open which door, and when?"
- The KX Room Management System holds the truth about room allocations, arrival and departure dates, and which rooms belong to which building or staircase.
- The Salto Access Control System is responsible for the physical doors and zones: who can open which lock, at what times, using which card.
- A separate card management process produces CSV exports of everyone who actually has a valid access card.
Before this project, these systems were loosely coupled via manual work and ad‑hoc updates. Accommodation staff and the security team had to:
- Manually grant or revoke Salto access when a student moved rooms.
- Remember to add bathroom or staircase access on top of the room door.
- Periodically clean up old access that should have expired but did not.
This led to predictable problems for a busy college:
- Students being locked out of their allocated rooms or shared facilities at critical times (arrival days, after room moves, early in term).
- Over‑permissive access, where a student retained access to a previous room or staircase long after moving out.
- Limited visibility of the overall access picture: no single place to answer "who can open this door today, and why?".
At the same time, any attempt to "automate everything" carried real risk. A bug would not just break a report – it could lock hundreds of students out at once or revoke access for emergency routes. That meant the solution had to be:
- Data‑driven and reproducible – access decisions must come from clearly defined rules and queryable data.
- Conservative by default – avoid removing access unless we are sure it is no longer justified.
- Auditable – we should be able to explain every proposed change in plain language.
The core motivation of this project was therefore to design a robust integration layer between KX and Salto which:
- Automatically derives the correct door and zone access from KX room allocations.
- Only operates on people who actually have cards and should be processed.
- Provides rich validation and "what would change?" views before anything is written back to Salto.
2. High-Level Architecture
Rather than pushing logic directly into either KX or Salto, we introduced a separate MySQL integration database that sits between the two systems and the card CSV feed. This database acts as the brains of the operation: it understands KX allocations, Salto’s door and zone model, and the list of real cardholders, and from that it computes the access each user should have.
At a high level, the architecture looks like this:
-
KX data import
- Nightly or on‑demand exports from KX supply room allocations (who is in which room, from when to when) and related metadata, such as building, staircase and sub‑location identifiers.
- These are loaded into integration tables like
kxroomallocations.
-
Cardholder staging
- The card system produces a CSV file of all active or valid cards.
- A load process turns this CSV into the
staging_users_doors_zonestable. - This staging table becomes the authoritative list of people the sync is allowed to touch.
-
Salto data mirror
- We import relevant Salto tables into the integration database:
salto_users,salto_doors,salto_users_locks,salto_users_zones,salto_zones,salto_zone_locksand others. - These tables reflect the current state of access control in Salto – who has access to which doors and zones today.
- We import relevant Salto tables into the integration database:
-
Proposed assignment views
- Two core views,
view_proposed_door_assignmentsandview_proposed_zone_assignments, take KX allocations, cardholder staging and Salto metadata as input. - They apply the project’s business rules to generate:
ExtDoorIDList– the set of doors a user should have.ExtZoneIDList– the set of zones a user should have.
- These are proposed states: nothing is updated in Salto yet.
- Two core views,
-
Sync procedures and scripts
- A set of SQL scripts and shell helpers under
sync/read from the proposed views and compare them to the current Salto state. - They then update the Salto user records (via
ExtDoorIDListandExtZoneIDList) for the subset of users who should be processed. - Manual access not managed by the sync is explicitly left alone.
- A set of SQL scripts and shell helpers under
-
Validation and audit layer
- Around this core pipeline we built a rich library of validation views under
validation-views/. - These answer questions like:
- Which students will lose access earlier than their KX allocation end date?
- Which doors are currently granted but will be removed by the new rules?
- Which students have KX allocations but no door or zone access?
- Separate comparison procedures track how sync runs differ over time, acting as an audit trail.
- Around this core pipeline we built a rich library of validation views under
The key architectural choice is to treat "what access should exist" as a set of pure, inspectable queries (views), and to keep the mutation of Salto’s state as a distinct, carefully controlled step on top. This separation makes the system easier to reason about, easier to validate, and safer to operate.
3. Core Design Principles
Very early on, we agreed a set of principles to guide every piece of SQL we wrote and every business rule we encoded. These principles show up throughout the views, sync scripts and validation framework.
-
Security first, but never at the expense of safety
- Removing access is as dangerous as granting it. We therefore prefer to identify and review potentially unsafe access via validation views before revoking it automatically.
- Where there is ambiguity in the data, we favour leaving access in place and surfacing a report for human review.
-
Only touch users we explicitly intend to process
- The staging table
staging_users_doors_zonesacts as a hard filter. If acrsidis not present in staging, the sync will not update their Salto access – no matter what KX says. - This protects staff and special cases from unintended changes and ensures we only touch people who actually have cards.
- The staging table
-
Preserve existing Salto intent wherever possible
- Salto already contains carefully configured access, especially for time‑limited or exceptional cases.
- The integration respects the
UsePeriodflags and existingStartDateandEndDatevalues:- Time‑limited access in Salto stays time‑limited with its original dates.
- Permanent access (
UsePeriod = 0) is not silently turned into time‑limited access.
- KX allocation dates are used to fill in the gaps, not to overwrite deliberate decisions made in Salto.
-
Make business rules explicit and inspectable
- Instead of hiding logic inside application code, we express it in clearly named views and SQL files.
- Rules like "exclude doors already covered by zones" or "treat bathroom doors via the Notes field" live in dedicated queries that can be read, tested and discussed with stakeholders.
-
Validate before, during and after sync
- Validation is not an afterthought; it is a first‑class part of the design.
- For every important rule, there is a corresponding validation or comparison view that lets us check how that rule behaves on real data.
- Before any major change to the rules, we can run scenarios and compare original, preserves‑KX and date‑extension behaviours using the test harness.
-
Design for explanation, not just correctness
- The system must not only do the right thing; it must also be explainable to non‑technical colleagues.
- We therefore back up the SQL with:
- Visual artefacts (floorplans and before or after diagrams).
- Human‑readable summaries of impact (for example, how many students gain or lose access under a new rule).
- This makes it possible to have informed conversations about trade‑offs and corner cases.
These principles kept the project grounded: whenever we faced a tricky modelling decision or a messy real‑world data problem, we could come back to them and choose the option that best aligned with security, safety, explicitness and explainability.
4. Data Model and Key Tables
Underneath the architectural picture, the project stands on a concrete data model that describes users, rooms, doors, zones and cards in a way all three systems can understand. This section introduces the most important tables and the identifiers that link them together.
4.1 Core identifiers
Two identifiers form the backbone of the integration:
-
crsid- The unique identifier for a person across KX, Salto and the card feed.
- Every major join that involves a person – from room allocations to access lists – ultimately runs through
crsid.
-
External door and zone IDs
ExtDoorIDidentifies a door in a way that both KX and Salto can agree on. For KX‑managed doors, these are typically prefixed withKX_.ExtZoneIDidentifies a zone, usually representing a building or staircase, often prefixed withKXZ_.
Keeping these IDs consistent – and correcting them when reality disagreed with the data – was a major strand of the work, supported by targeted scripts under fixing_ids/.
4.2 KX allocations
The KX side answers the question "who should be in which room, and when?"
kxroomallocations- Contains one row per person–room allocation, including at least:
crsid– who the student is.room_idor a similar room identifier.- Arrival and departure dates.
- A link to the building or sub‑location (used later for zones).
- For some rooms, an
ExtDoorIDis already known or derivable, directly linking the allocation to a Salto door.
- Contains one row per person–room allocation, including at least:
Additional KX‑derived tables capture building, staircase and sub‑location information. Together, these let us map from a room allocation to:
- The individual room door that should be granted.
- The shared facilities (bathrooms, kitchens, staircases) that need to be added alongside the room.
- The zone that represents the building or staircase entrance.
4.3 Salto users and access
On the Salto side, the data model is about who can open what, and under which conditions.
-
salto_users- One row per person with a Salto profile.
- Key columns include:
id_user– Salto’s internal numeric identifier.crsid– to link back to KX and staging.department– used to distinguish staff from students in some rules.ExtDoorIDListandExtZoneIDList– the fields we ultimately populate for the sync engine.
-
salto_users_locks- One row per user–door relationship.
- Represents direct door access.
- Includes:
id_userandid_Lock– the core relationship.UsePeriod– whether access is permanent (0) or time‑limited (1).StartDateandEndDate– used whenUsePeriod = 1.
-
salto_users_zones- One row per user–zone relationship.
- Represents zone‑based access, typically for building entrances, staircases or other grouped facilities.
These tables reflect the current reality in Salto. They are never overwritten blindly; instead, we use them both as inputs for the proposed assignment views and as baselines for validation.
4.4 Doors, zones and their relationships
Salto models the physical environment via doors and zones.
-
salto_doors- One row per door.
- Key columns include:
id_Lock– Salto’s internal key for the lock.ExtDoorID– the external identifier we match against KX.Notes– a critical free‑text field we normalise to hold a comma‑separated list of room IDs for shared facilities (for example, a bathroom serving several rooms).
-
salto_zones- One row per zone, often representing a building, staircase or cluster of doors.
- Contains
ExtZoneID, which we map from KX sub‑location information.
-
salto_zone_locks- The bridge between zones and doors.
- Each row links one
id_Zoneto oneid_Lock, describing which doors belong to which zones.
By combining salto_zone_locks with salto_users_zones, we can work out which doors a user can already reach through zone membership alone. This is essential for implementing the
zone exclusion rule later.
4.5 Cardholder staging
The card CSV feed answers a separate question: "who actually has a card that the system can act on?"
staging_users_doors_zones- Holds one row per cardholder from the CSV.
- The most important column for the integration is
crsid. - Only users who appear here are allowed to be processed by the sync.
Treating this as a first‑class table has two big advantages:
- It gives us a clear, auditable list of people whose access might change during a sync.
- It protects everyone else – especially staff or historic users – from accidental changes when we only intend to manage student room access.
5. Business Rules in Detail
With the data model in place, the heart of the project lies in the business rules: precise statements about who should get which access, and why. These rules are encoded directly into SQL views and are reflected in the validation layer.
For clarity, we present them in the same structure used in the internal project documentation, expanding each rule into its practical implications.
5.1 Rule 0 – Staging filter (cardholder gate)
Only users who appear in staging_users_doors_zones are processed by the sync.
This rule is so important that it effectively sits in front of every other decision:
-
If a
crsidis not instaging_users_doors_zones:- We do not update their
ExtDoorIDListorExtZoneIDList. - We do not add or remove any of their door or zone access.
- For the purposes of the integration, they are invisible.
- We do not update their
-
If a
crsidis instaging_users_doors_zones:- They are eligible for processing.
- Their access can be updated to match the proposed views.
This simple gate achieves several things at once:
- It ensures we only operate on people who have active cards.
- It prevents staff and special‑case users (who may appear in Salto but not in the student card feed) from being changed by accident.
- It makes it trivial to temporarily exclude a user from sync by removing them from staging.
In SQL terms, this rule appears as a mandatory join in validation and comparison views, and as a filter in the sync scripts themselves.
5.2 Rule 1 – Cardholder with KX allocation
We only derive access for users who both have a card and have at least one KX room allocation.
Even within staging, not everyone needs automated access management. For example:
- Staff may hold cards but do not have KX room allocations.
- Some users may exist in staging for other operational reasons.
The integration therefore focuses on the intersection of three sets:
- Users with a row in
kxroomallocations. - Users with a row in
salto_users. - Users present in
staging_users_doors_zones.
By doing this, we ensure that:
- Students with rooms and cards are fully managed by the sync.
- Staff and others without KX allocations are left alone.
- Orphaned records (for example, a KX allocation for someone not in staging) are highlighted by validation views rather than being processed blindly.
5.3 Rule 2 – Door matching logic
A user should get door access if the door represents their room or a shared facility associated with their room.
There are two main pathways for this:
-
Direct room door match
- The allocation’s
ExtDoorIDmatches the door’sExtDoorID. - This is the straightforward case for individual room doors.
- The allocation’s
-
Shared facility match via
Notes- Some doors represent shared bathrooms, kitchens or other facilities.
- For these, the
Notescolumn onsalto_doorsis normalised to a comma‑separated list of room identifiers. - If the student’s room appears in this list, they should receive access to that door.
This rule turns the messy reality of shared facilities into a deterministic, queryable relationship. It also allows us to extend or correct the mapping simply by updating the
Notes field in Salto and re‑running the sync, without touching application code.
5.4 Rule 3 – Zone exclusion (avoid redundant door access)
If a user already has access to a door via a zone, we generally do not also grant them direct door access for the same lock.
Zones are Salto’s way of grouping sets of doors, for example all doors in a staircase or building. Granting both zone access and per‑door access for the same lock is usually redundant and can make the access picture harder to reason about.
To avoid this, the proposed door assignments:
- Compute all doors that a user can already open by virtue of their zone memberships (via
salto_users_zonesandsalto_zone_locks). - Exclude those doors from the list of direct door grants, unless there is a specific reason to keep both.
The result:
- The
ExtDoorIDListfocuses on doors that are not already covered by zones. - Zone‑based access remains the preferred way to grant large groups of doors (for example, all entrance doors to a building).
5.5 Rule 4 – Date preservation hierarchy
Never overwrite deliberate Salto time windows with KX dates; use KX dates to fill gaps, not to erase intent.
Salto stores access periods using two concepts:
UsePeriod = 0– permanent access, often accompanied by default or historic dates.UsePeriod = 1– time‑limited access, with meaningfulStartDateandEndDatevalues.
When building proposed access, we use a clear hierarchy:
-
Existing time‑limited access in Salto wins
- If Salto already has
UsePeriod = 1for a user–door or user–zone combination, we preserve thatStartDateandEndDate.
- If Salto already has
-
Existing permanent access in Salto comes next
- If
UsePeriod = 0, we continue to treat the access as permanent, regardless of any historic dates stored in the record.
- If
-
KX allocations fill the gap
- Where there is no pre‑existing Salto period, we use the earliest arrival and latest departure from KX to propose time‑limited access.
-
Fallback to permanent where needed
- In cases where no reliable KX dates exist, we default to permanent access rather than risk cutting off a user unexpectedly.
This hierarchy ensures that:
- We respect manual or historically important Salto configurations.
- KX data is used constructively without being allowed to trample over carefully curated access windows.
5.6 Rule 5 – Shared facilities via Notes
Shared facilities are modelled through the Notes field on salto_doors, treated as a list of rooms served by that door.
Many buildings in the college – Cripps is a prominent example – have shared bathrooms and other facilities that serve a small cluster of rooms. There is no separate table in Salto
for these relationships, so we use the Notes column as a structured, cleaned list of room identifiers.
The workflow is:
- Normalise
Notesfor relevant doors so that it becomes a clean, comma‑separated list, such as:KX_360,KX_361,KX_362for a bathroom serving three rooms.
- When building proposed door assignments, look up all doors where the student’s room appears in that list.
- Grant access accordingly.
This pattern has turned out to be powerful and maintainable: accommodation and security staff can adjust the mapping by updating Notes in Salto, and the integration automatically
reflects the change on the next run.
6. Building the Proposed Assignment Views
The centrepiece of the integration is a pair of MySQL views that describe, in a purely declarative way, what access each user should have based on KX allocations and the business rules.
view_proposed_door_assignments– defines the doors that each eligible user should be able to open, and how long for.view_proposed_zone_assignments– defines the zones each user should belong to, and the dates for that membership.
These views do not modify anything by themselves. Instead, they provide a computed, inspectable target state which the sync scripts can then use to update Salto safely.
6.1 Inputs to the views
Both views rely on the same family of inputs, orchestrated via joins and filters:
- KX allocations:
kxroomallocationsand related tables. - Salto users:
salto_users(to tiecrsidtoid_user). - Cardholder staging:
staging_users_doors_zones(to enforce the staging filter). - Doors and zones:
salto_doors,salto_zones,salto_zone_locks. - Existing access:
salto_users_locks,salto_users_zones.
The views implement the business rules by:
- Selecting only users who are both in staging and have KX allocations.
- Mapping those allocations to candidate doors and zones.
- Applying date preservation logic and zone exclusion.
- Aggregating the resulting access into the list formats (
ExtDoorIDListandExtZoneIDList) that Salto expects.
6.2 view_proposed_door_assignments
This view answers: "For each student, which door locks should they have in their ExtDoorIDList, and with what date behaviour?"
Conceptually, it performs the following steps:
-
Identify eligible users
- Join
kxroomallocationswithsalto_usersandstaging_users_doors_zonesoncrsid. - Optionally filter out staff using the
departmentcolumn insalto_users.
- Join
-
Find candidate doors
- Match allocations to doors using:
- Direct
ExtDoorIDequality for room doors. Noteslookups for shared facilities (bathrooms, kitchens and so on).
- Direct
- Match allocations to doors using:
-
Apply zone exclusion
- Join candidate doors to
salto_zone_locksandsalto_users_zonesto find doors already covered by zone access. - Exclude those from the candidate set, so they are not duplicated as direct door grants.
- Join candidate doors to
-
Determine date behaviour
- Look up existing direct access in
salto_users_locksfor the same user–door combination. - Apply the date preservation hierarchy:
- If there is an existing time‑limited record, preserve its
UsePeriod,StartDateandEndDate. - If there is existing permanent access, treat it as permanent.
- Otherwise, fall back to KX allocation dates, or permanent access if KX dates are missing.
- If there is an existing time‑limited record, preserve its
- Look up existing direct access in
-
Construct
ExtDoorIDList- Combine the chosen doors and their date behaviours into the Salto‑specific list format.
- Produce one row per user, containing the final proposed
ExtDoorIDListalongside useful breakdown columns for debugging and validation.
The end result is a view that you can treat as "if we ran the sync right now, this is what each student’s door list would look like", without writing a single change to Salto.
6.3 view_proposed_zone_assignments
Zones handle building‑level access such as main entrances, staircases and global facilities. view_proposed_zone_assignments answers: "Which zones should each student belong to,
and for how long?"
The steps mirror those for doors but focus on sub‑location and building identifiers:
-
Identify eligible users
- Again, start from
kxroomallocationsjoined tosalto_usersandstaging_users_doors_zones.
- Again, start from
-
Map sub‑locations to zones
- Use KX building or sub‑location IDs to determine the appropriate
ExtZoneIDfor each allocation.
- Use KX building or sub‑location IDs to determine the appropriate
-
Merge allocations per user–zone
- For students with multiple allocations in the same building or staircase, compute an overall earliest arrival and latest departure.
-
Preserve existing zone dates
- Join to
salto_users_zonesto see if the user already has that zone. - Respect
UsePeriodand existingStartDateandEndDateusing the same hierarchy as for doors.
- Join to
-
Construct
ExtZoneIDList- Aggregate the final set of zones and their date behaviours into the list format expected by Salto.
By keeping the patterns for doors and zones parallel, the system is easier to maintain and reason about. Changes to date handling or eligibility can often be made in one place and then applied consistently to both views.
6.4 Naming and versioning
Each major iteration of the rules or date logic is captured as a new SQL file and, when needed, a new view name. The project maintains a convention that:
- The SQL filename and the view name match as closely as possible.
- Deprecated or experimental versions are clearly labelled and kept separate from the production view.
This makes it possible to:
- Compare different strategies (for example, "preserves KX" versus "date extension") side by side.
- Roll back to a previous version of the logic if new data reveals a flaw.
7. Sync Mechanics and Operational Flow
While the proposed assignment views define the target state, the sync layer is responsible for carefully bringing Salto into line with that state. This layer is implemented as a
collection of SQL scripts and shell helpers under the sync/ directory.
7.1 Overview of the sync process
At a high level, a sync run follows this pattern:
- Ensure that KX data, cardholder staging and the Salto mirror tables are all up to date.
- Refresh or validate the proposed assignment views.
- Compare the proposed door and zone lists with the current
ExtDoorIDListandExtZoneIDListinsalto_users. - For eligible users in staging, update their lists where there are differences.
- Record what changed for audit and troubleshooting.
The scripts are designed to be idempotent: running them multiple times with the same inputs will produce the same result, making them safe to schedule hourly or nightly.
7.2 Key sync scripts
The sync/queries/ directory contains the main building blocks of the sync logic, including (among others):
-
User list synchronisation scripts, which:
- Join
salto_userswith the proposed views. - Decide when and how to update
ExtDoorIDListandExtZoneIDListfor each user.
- Join
-
Structure‑level scripts for doors and zones, which:
- Keep the door and zone metadata in sync with upstream sources where needed.
-
Group and membership sync scripts, which:
- Handle group‑based access where relevant, complementing the direct door and zone lists.
Supporting these are utility queries such as:
- Preview queries to show what would change for a given run.
- Audit queries to record the before and after state of key tables.
These SQL components are typically orchestrated by lightweight shell wrappers such as hourly_sync.sh, which bundle the right sequence of commands into a single operation.
7.3 Safety guarantees during sync
Because the system controls physical access, the sync layer includes explicit safety guarantees beyond the business rules themselves:
-
Staging enforcement
- Only users whose
crsidappears instaging_users_doors_zonesare considered for updates. - This rule is enforced again at sync time, even though it also appears in the views.
- Only users whose
-
No interference with manual access
- Access that is flagged as manually managed is excluded from automated updates.
- This allows security staff to grant exceptional access for special cases without fear that the next sync will remove it.
-
Incremental changes, not wholesale replacement
- Where possible, the sync compares the current lists with the proposed lists and only updates users where something has actually changed.
- This reduces the surface area of each run and makes audits easier to reason about.
-
Auditability
- Each run can produce a record of:
- How many users were updated.
- How many door and zone assignments were added or removed.
- Any anomalies detected by validation queries run alongside the sync.
- Each run can produce a record of:
7.4 Operational cadence
In practice, the system can be run at different cadences depending on the time of year and risk appetite:
-
Daily or hourly in steady state
- Once stable, the sync can keep Salto closely in step with KX and the cardholder feed.
-
Manually with enhanced validation before critical periods
- Before the start of a term, or before a large room move event, operators can:
- Refresh KX and staging data.
- Run the validation views to spot any unexpected impacts.
- Use preview queries to see how many students will gain or lose access.
- Only then execute the full sync.
- Before the start of a term, or before a large room move event, operators can:
By treating the sync as a repeatable but inspectable process – rather than a black‑box job – the college gains both automation and control: the system does the heavy lifting, but humans remain in charge of when and how big changes are rolled out.
8. Testing & Validation Framework
Given that this system controls who can physically enter college buildings, hoping the SQL is correct was never an option. Instead, we treated validation views and test scenarios as first‑class artefacts, on the same level as the core assignment views and sync scripts.
In broad terms, the validation layer answers five categories of question:
- Security – who currently has access they should not have?
- Student experience – who is at risk of being locked out?
- Data quality – where are there conflicts or inconsistencies in KX and Salto?
- Business rule correctness – are our rules behaving the way we think?
- Change impact – what will actually change when we run the sync?
To do this, we use a combination of validation views under validation-views/ and a separate scenario harness under test-scenarios/.
8.1 Validation view philosophy
Validation views are designed more like integration tests than simple reports:
- They focus on questions the business cares about, not just technical metrics.
- They always respect the staging filter so that results reflect what the sync can actually change.
- They are separated into clear categories so that operations staff can prioritise:
- Critical security issues.
- Student‑impacting access gaps.
- Data quality anomalies.
Each view is narrowly scoped and named to describe the specific problem it detects.
8.2 Security validations
These views focus on over‑permissive or outdated access – the cases that keep security officers awake at night.
Examples include:
-
Expired access still active
- Views identify users whose time‑limited Salto access has passed its end date but remains enabled.
-
High‑risk permanent access
- Views surface users with wide‑ranging or sensitive permanent access.
- Combined with KX and staging information, this helps distinguish justified from unjustified permanent access.
-
Doors to be removed by preserves‑KX logic
- Views that show which door assignments would disappear under a given rule set allow security staff to inspect the list before any automatic removal happens.
By running these regularly, the college gains a rolling picture of where privileges have drifted away from policy, even before the new sync rules are fully implemented.
8.3 Student experience validations
These views aim to catch situations where students will not be able to access the rooms or facilities they should, given their KX allocations.
Representative examples:
-
Missing room door access
- Views find students who:
- Have active or upcoming room allocations in KX.
- Are in
staging_users_doors_zones(so they have cards). - But do not have the expected room door in their Salto access.
- Views find students who:
-
Missing zone access
- Similar checks at the zone level ensure building or staircase entrances align with KX.
-
KX allocations without cards
- Views identify students who have rooms in KX but are absent from staging.
- This often reveals operational issues (for example, students whose card data has not yet been loaded), letting teams fix the input rather than chasing missing access after the fact.
These views are critical in the run‑up to term starts and major move‑in days, when the cost of a student being locked out is high.
8.4 Data quality validations
Some of the most subtle problems are not directly about access, but about the underlying data being inconsistent or contradictory.
Key examples:
-
Room conflicts
- Views surface cases where multiple students appear to occupy the same room over overlapping dates.
- These conflicts usually require human intervention in KX before access rules can be safely applied.
-
Date mismatches
- Views compare KX allocation windows with Salto access dates.
- They highlight where Salto access ends earlier than KX allocations, or lingers far beyond them.
-
Duplicate or inconsistent access
- Views look for multiple overlapping access entries for the same user, door and zone.
- Separate views track cases where users have access that no longer corresponds to any live KX allocation.
By treating these as explicit validation artefacts, the project turns "data cleaning" from an unstructured activity into a repeatable, query‑driven process.
8.5 Business rule validation
A separate family of views verifies that the core business rules (date preservation, zone exclusion, shared facilities and so on) behave as intended when applied to real data.
For example:
- Summary views show, at a high level, how many door and zone assignments will be added or removed when the preserves‑KX logic is applied.
- Comparison views show, per user, how current Salto access compares to what a particular version of the rules would generate.
- Other views group changes by time horizon, making it easier to see whether access is being shortened, extended or left untouched.
These views form the bridge between technical rules and managerial understanding: they turn raw differences into narratives like "under this proposal, a small set of students would lose access to Cripps bathrooms three days earlier than their KX booking ends".
8.6 Before and after comparisons
Finally, stored procedures and views support longitudinal analysis of sync runs:
- They compare the results of different sync runs over time.
- They can group changes by day windows (for example, access changes in the next seven days or thirty days).
- They provide a structured way to ask "what actually changed last night?" or "how does the new rule set compare to the old one over the next term?".
These comparison tools proved especially useful during the iterative design of the rules, when multiple candidate strategies (original, preserves‑KX, date‑extension) were evaluated side by side before settling on the final approach.
9. Test Scenarios and Simulation Harness
The validation views answer questions about the real production‑like state. In parallel, we built a scenario harness under test-scenarios/ that lets us construct controlled
worlds, run the full pipeline, and see exactly how the rules behave.
This harness behaves much like an integration test suite in application development:
- Each scenario sets up a known KX and Salto state.
- The sync logic is run (often in multiple variants).
- The resulting access is captured and compared against expected outcomes.
9.1 Structure of the test harness
The test-scenarios/ directory contains three main elements:
-
Setup and reset scripts
- Scripts to reset Salto test users to a clean baseline.
- Scripts to load specific KX allocations for the scenario.
- Utilities to ensure staging matches the intended test population.
-
Runner script
- A shell script that orchestrates:
- Resetting the test state.
- Applying scenario‑specific allocations and access.
- Running the relevant version of the proposed assignment views and sync logic.
- Capturing the resulting door and zone assignments.
- A shell script that orchestrates:
-
Result snapshots
- Output files that record the state after each run.
- Variants for different rule sets (original, preserves‑KX, date extension and so on).
Together, these pieces allow us to say "given this timetable of moves and existing access, what would each strategy do?" in a repeatable way.
9.2 Scenario matrix
A scenario matrix acts as a table of contents for the test harness. It lists:
- The types of situations we care about, for example:
- Simple single‑room allocations.
- Room moves with gaps.
- Overlapping allocations.
- Shared bathrooms with complex room sets.
- Edge‑case date patterns (same‑day moves, one‑night stays, and so on).
- Which scripts set up each situation.
- Which output files capture the results.
This matrix ensures that as the rules evolved, new edge cases were added deliberately rather than discovered accidentally in production.
9.3 Example scenarios
A few concrete examples illustrate the power of the harness:
-
Overlapping room allocations
- Two rooms allocated to the same student with overlapping dates.
- The scenario checks that access does not terminate early for either room and that shared facilities behave sensibly.
-
Back‑to‑back allocations with no gap
- A student moves from one room to another on the same day.
- The scenario verifies that there is no period where the student has access to neither room, and no unintended overlap to both rooms beyond what is acceptable.
-
Bathroom sharing in complex staircases
- Cripps and similar buildings, where a bathroom serves a particular cluster of rooms.
- The scenario tests that the
Notesfield mapping correctly grants access only to the intended rooms, and that removing a room fromNotesremoves bathroom access on the next sync.
By running these scenarios across multiple rule versions and comparing the outputs, we built confidence that the final rule set behaves well across the full range of real‑world patterns seen in the college.
10. Performance and Indexing
The integration database has to join several sizeable tables on every run: user tables, allocations, doors, zones, and multiple access lists. Without careful indexing and query design, the proposed assignment views could easily have become too slow for practical use.
10.1 Why performance matters
Performance is not just a nicety here:
- The sync may run nightly or even hourly, and must finish reliably within a predictable window.
- Operators need to be able to run preview and validation queries interactively, without waiting minutes for every answer.
- Slow queries are harder to reason about and debug, especially under time pressure before term starts.
10.2 Indexing strategy
We added and tuned indexes to support the most common join paths and filters, including on:
crsidin all user and staging tables.ExtDoorIDinsalto_doorsand any KX mapping tables.ExtZoneIDinsalto_zonesand related structures.id_userandid_Lockinsalto_users_locksandsalto_zone_locks.
These indexes are documented in dedicated SQL files under create-schema/, which can be applied as part of the schema setup.
10.3 Query shaping
Beyond indexes, we shaped the views themselves to avoid unnecessary work:
- Applying the staging filter early, so that we only process users who can actually be affected by sync.
- Avoiding expensive constructs in tight loops, favouring set‑based logic where possible.
- Splitting particularly heavy logic into intermediate views when that made the planner’s job easier.
The result is a set of core views and validation queries that remain responsive even on realistic data volumes, making the system pleasant to operate and safe to schedule.
11. Handling Real-World Data Problems
No integration project survives first contact with real data. Alongside the clean architecture and rules, a significant amount of work went into discovering, understanding and handling messy or inconsistent records.
11.1 Mismatched identifiers
Despite best efforts, the IDs in KX and Salto did not always line up cleanly:
- Some room records in KX did not have a corresponding
ExtDoorIDin Salto. - Some doors in Salto lacked a clear mapping back to a KX room.
- Legacy naming schemes and manual edits created near‑duplicates and ambiguous entries.
We addressed these through:
- Targeted analysis queries to find doors or rooms with no counterpart.
- A small library of fix‑up scripts that:
- Propose consistent
ExtDoorIDvalues. - Update Salto doors to bring them into alignment with KX naming.
- Propose consistent
11.2 Normalising the Notes field
The Notes field on salto_doors is central to modelling shared facilities, but in practice it contained:
- Free‑text comments.
- Partially formatted room lists.
- Historic annotations that no longer matched reality.
We introduced a discipline around it:
- For doors that represent shared facilities,
Notesis cleaned and formatted as a strict comma‑separated list of room identifiers. - Any additional commentary is either moved elsewhere or added in a separate convention that does not interfere with parsing.
Over time, this turned Notes from a liability into a powerful extensibility point for the rules.
11.3 Dealing with legacy access
Historical access patterns – especially for long‑standing staff or special cases – could not always be neatly mapped back to current KX data.
We handled this by:
- Leaning heavily on the date‑preservation rule to avoid clobbering long‑standing, deliberate access.
- Using validation views to highlight suspicious patterns (for example, very old permanent access that no longer has an obvious justification).
- Working with stakeholders to decide which legacy access should be left untouched, which should be moved into manual‑only configuration, and which could be safely cleaned up.
In other words, we did not try to solve every historical inconsistency in SQL. Instead, we built tooling that made them visible, then collaborated with domain experts to decide what to do.
12. Visualisation and Stakeholder Communication
Although the core of the work is SQL and schema design, the project would never have succeeded without good communication with non‑technical stakeholders: accommodation, security and senior management.
12.1 Floorplans and door mapping
We used annotated floorplans to:
- Show how KX room IDs, Salto
ExtDoorIDvalues and physical rooms fit together. - Illustrate which rooms shared bathrooms and other facilities.
- Explain complex cases like the Cripps building, where staircases, corridors and bathrooms interact in non‑obvious ways.
These diagrams made it much easier to discuss questions such as "should this bathroom be shared between these three rooms or all six on the landing?" without having to dive into SQL.
12.2 Before and after examples
For key test cases, we prepared before and after snapshots:
- Screenshots of Salto user access before the new rules.
- The corresponding proposed door and zone lists after applying the integration.
- Narrative explanations of why certain doors were added or removed.
These examples helped to build trust:
- Security staff could see that existing manual overrides were being preserved.
- Accommodation staff could see that students would not lose critical access unexpectedly.
12.3 Management‑level summaries
Finally, we prepared higher‑level documents summarising:
- The main sources of risk the system addresses (over‑permissive access, students locked out, and so on).
- The controls and safeguards in place (staging filter, validation views, audit logs).
- The outcomes expected from a full rollout (fewer manual changes, clearer responsibility boundaries, better reporting).
This material allowed senior stakeholders to sign off on the project with a clear understanding of both its benefits and its limitations.
13. Operational Playbook
In practice, the day‑to‑day operation of the system is encoded directly in sync/hourly_sync.sh. Rather than a vague “run the sync” step, this script implements a concrete, ordered
pipeline that can be inspected and reasoned about.
13.1 The hourly pipeline (hourly_sync.sh)
The script runs four main stages.
-
Backup the current staging state
- Before anything else, it takes a timestamped copy of
salto.staging_users_doors_zonesinto thesalto_sync_backupsdatabase:CREATE OR REPLACE TABLE <timestamp>_staging_users_doors_zones AS SELECT * FROM salto.staging_users_doors_zones.
- This gives a point‑in‑time snapshot of what the system thought each user’s access should be before the new run.
- Before anything else, it takes a timestamped copy of
-
Refresh all source data into Maria VDT
- KX room allocations are pulled from KX live into the Maria VDT copy using
db_tool.py:python db_tool.py sync --truncate --sql queries/sync-room-allocations-from-kxlive.sql --from kxlive --to mariavdt --table kxroomallocations.
- Salto structures are pulled from the live Salto system into Maria VDT, again via
db_tool.py syncwith--truncatefor each table:- Doors (
salto_doors), users (salto_users), user–door links (salto_users_locks), user–zone links (salto_users_zones), zones (salto_zones), and zone–door links (salto_zone_locks).
- Doors (
- At the same time, the current card CSV is loaded into
staging_users_doors_zonesin Maria usingload-issued-cards-from-csv.sql. - These operations are started in parallel where safe, and the script waits for them to complete before moving on.
- KX room allocations are pulled from KX live into the Maria VDT copy using
-
Snapshot Salto state and compute proposed access lists
- With all source data refreshed, the script:
- Runs
capture-salto-sync-audit.sqlto save a snapshot of the current Salto‑mirrored state for audit and comparison. - Runs
update-staging-with-access-lists.sql, which uses theview_proposed_zone_assignments_preserves_kxandview_proposed_door_assignments_preserves_kxviews to recalculateExtDoorIDListandExtZoneIDListinstaging_users_doors_zones.
- Runs
- At this point, staging holds the new proposed access lists for every cardholder the system manages, derived from up‑to‑date KX, Salto and card data.
- With all source data refreshed, the script:
-
Export validation and impact reports
- Finally, the script creates an
exportsdirectory and callsdb_tool.py exportmultiple times in parallel to produce Excel reports for key tables and views, including:- Room mapping (
sync-kx-rooms.sql). access_date_mismatches,critical_students_losing_access_early,kx_allocations_no_cards,room_conflicts_multiple_students.students_duplicate_access,users_permanent_access,users_with_expired_access.view_sync_changes,view_sync_changes_by_days,users_wrong_room.
- Room mapping (
- A PowerShell script (
4-export-to-sharepoint.ps1) then publishes these.xlsxfiles to SharePoint so that accommodation and security staff can review them without touching SQL.
- Finally, the script creates an
13.2 Role of db_tool.py
The Python tool db_tool.py is a thin but important layer in this pipeline:
- The
syncsubcommand encapsulates “run this SQL to copy data from source X to target Y, truncating the target table first”. - The
exportsubcommand runs a query (either from a table or from a SQL file) and writes the result straight to an Excel file. - Connection details and encodings are handled once inside the script, keeping
hourly_sync.shreadable.
Crucially, db_tool.py does not contain the access rules itself; those remain in the SQL views and scripts. Python’s job is to move data between systems and to turn query results
into consumable reports, while hourly_sync.sh fixes the exact order and grouping of steps.