Unduplicating duplicate lines with SQL and RPG

This is probably a very specific issue but I’m quite proud of the solution. And it may prove to be useful to someone else (or me at some other point) so I’m putting it here for posterity.

The issue

We receive a list of sales transactions through an EDI interface. With the implementation of a new retail management system, these transactions now need to be pointed in the direction of the new application. This has proved interesting, on occasion, as the existing system is a lot more forgiving of incomplete data than the new one.

One of these cases, we have discovered, is that for some stores we’re not receiving transaction numbers. We have dates, times, store codes, transaction line numbers – but no transaction numbers. This means that if a store manages to enter two transactions at two tills at exactly the same time, we get a whole bunch of duplicate lines in the interface extract.

This interface is a transitional one (once the roll-out is completed for all stores, it will be redundant) so I shouldn’t be spending too much time on it. Just enough to make it work.

The Solution

Putting together an SQL statement that identifies transactions with duplicate line numbers is pretty easy, as is constructing a statement to update records based on a fixed criteria. The question is, how easy is it to construct an SQL statement to quickly update records based on a dynamic criteria.

The answer is: easier than I expected.

The Implementation

The (slightly simplified) code, is pasted below without explanation but if any part of it is unclear do feel free to either leave a comment or contact me and I will be happy to clarify.

      * Program     :                                                          *
      * Description : Clean up duplicate lines                                 *
      * Written by  : Paul Pritchard                                           *
      * Date        :  2/07/2012                                               *
     D pDuplicate      DS                  QUALIFIED
     D  Date                         10A
     D  Time                          8A
     D  Store                         5S 0
     D  Register                      3S 0
     D  Transaction                   5S 0
     D  Line                          3S 0
     D SaveDate        S             10A
     D SaveTime        S              8A
     D SaveStore       S              5S 0
     D SaveReg         S              3S 0
     D SaveTxn         S              5S 0
     D SaveLine        S              3S 0
     D NewLine         S              3S 0

          // The Commitment control workaround
          exec sql Set Option Commit = *NONE;

          // Renumber any duplicate lines in the Sales File
          exsr srDuplicate;

          // And exit
          *INLR = *ON;

      * Subroutine: Renumber any duplicate lines in the Sales interface file   *

          begsr srDuplicate;

              // Initialise the work values
              exsr srReset;

              // Find the lines with duplicate numbers
              exec sql declare cSales cursor for
                      Duplication as
                      (select wrslsdate, wrslstime, wrslsistr, wrslsreg#,
                              wrslstxn#, wrslslin#, count(*)
                       from SALES
                       group by wrslsdate, wrslstime, wrslsistr, wrslsreg#,
                                wrslstxn#, wrslslin#
                       having count(*) > 1
                       order by wrslsdate, wrslstime, wrslsistr, wrslsreg#,
                                wrslstxn#, wrslslin#)
                  select wrslsdate, wrslstime, wrslsistr, wrslsreg#, wrslstxn#,
                  from SALES
                  where wrslsdate || wrslstime || char(wrslsistr) ||
                  in (select wrslsdate || wrslstime || char(wrslsistr) ||
                      char(wrslsreg#) from Duplication)
                  order by wrslsdate, wrslstime, wrslsistr, wrslsreg#,
                           wrslstxn#, wrslslin#
                  for update of wrslslin#;
              exec sql open cSales;

              // Start retrieving the data
              dou SQLCOD = 0 and SQLCOD  100;
                      exsr srRenumber;
                      exec sql
                          update SALES
                          set WRSLSLIN# = :NewLine
                          where current of cSales;

              // Close the cursor and quit
              exec sql close cSales;


      * Subroutine: Renumber                                                   *

          begsr srRenumber;

              if pDuplicate.Date  SaveDate or
                 pDuplicate.Time  SaveTime or
                 pDuplicate.Store  SaveStore or
                 pDuplicate.Register  SaveReg or
                 pDuplicate.Transaction  SaveTxn;

                  SaveDate = pDuplicate.Date;
                  SaveTime = pDuplicate.Time;
                  SaveStore = pDuplicate.Store;
                  SaveReg = pDuplicate.Register;
                  SaveTxn = pDuplicate.Transaction;

                  NewLine = 0;

              NewLine += 1;


      * Subroutine: Reset                                                      *
      *             Initialise the Save* Values                                *

          begsr srReset;

              SaveDate = *BLANKS;
              SaveTime = *BLANKS;
              SaveStore = *ZERO;
              SaveReg = *ZERO;
              SaveTxn = *ZERO;