With a recent interface change, I was asked to go back and fix all of the affected historical data so that it matched the new requirements. Updating a field in one file with a value from a different file is something I have done several times in the past (far too many ah-hoc queries have been launched, if truth be told) and, while you do need to take a bit of care, the approach is pretty simple.
So here is an example (some names have been changed to protect the proprietary):
update target_file upd set target_field = (select source_field from source_file where source_key_1 = substr(upd.target_key, 1, 16) and digits(source_key_2) = substr(upd.target_key, 17, 5) ) where exists (select source_field from source_file where source_key_1 = substr(upd.target_key, 1, 16) and digits(source_key_2) = substr(upd.target_key, 17, 5) )
The script is pretty simple. For each record in target_file for which an associated record can be found in source_file, populate target_field with the value in source_field. Obviously, the select clauses will need to reflect the relevant keys of whatever files you happen to be using.
Inevitably, there is a gotcha: for each record in target_file that you want to update, there must be exactly one record returned by the subquery. Handling this can be split into two parts.
The first part is handled by the
where exists clause which ensures that the script will only attempt to update records in target_file if there is a record in source_file with which to update it. This ensures you don’t get caught out by subqueries that return zero records.
The second part involves that the subquery returns no more than one record for each record in target_file. This, unfortunately, cannot be solved generically – you just need to be a bit careful to ensure that the subquery selection is returning unique records. If in doubt, a variation on the below SQL can be used to validate.
select source_key_1, digits(source_key_2), count(*) from source_file group by source_key_1, digits(source_key_2) having count(*) > 1
If you can’t find a unique selection criteria, the
distinct clause may help and, if all else fails, try arbitarily using either