This week our SAS paper review focusses on “Proc Format, a Speedy Alternative to Sort / Sort / Merge” by Jenine Milum.
The paper explains how Proc Format can be used to combine information from multiple datasets in place of the traditional “sort/sort/merge” approach, and shows how this can reduce CPU time by more than 70% when dealing with large files. In the pharma world this approach has many possible uses. Consider, for example, the need to look up the RFSTDTC variable in SDTM.DM and merge it onto each of your SDTM datasets in order to derive study day variables. The traditional route sorts both datasets by USUBJID and merges them. The author’s approach instead involves building a format from SDTM.DM which maps USUBJID to RFSTDTC. RFSTDTC can then be attached within the DATA step using the PUT function, with no sorting of the larger dataset required.
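To make the RFSTDTC example concrete, here is a minimal sketch of how such a lookup might be written. It is our own illustration rather than code taken from the paper: the format name RFSTFMT, the work dataset names and the AESTDY derivation are assumptions, and it assumes SDTM.DM holds exactly one record per USUBJID (duplicate keys would make Proc Format reject the control dataset).

```sas
/* Build a control (CNTLIN) dataset mapping USUBJID to RFSTDTC.       */
/* RFSTFMT is a hypothetical format name chosen for this example.     */
data fmt_in;
    set sdtm.dm (keep=usubjid rfstdtc) end=last;
    retain fmtname 'RFSTFMT' type 'C';   /* character format $RFSTFMT. */
    start = usubjid;
    label = rfstdtc;
    output;
    /* Catch-all OTHER row: subjects not found in DM return blank.    */
    if last then do;
        hlo   = 'O';
        start = ' ';
        label = ' ';
        output;
    end;
    keep fmtname type start label hlo;
run;

proc format cntlin=fmt_in;
run;

/* Attach RFSTDTC to AE without sorting either dataset.               */
data ae_rf;
    set sdtm.ae;
    length rfstdtc $19;
    rfstdtc = put(usubjid, $rfstfmt.);

    /* Illustrative study day derivation from the date parts only.    */
    /* The ?? modifier suppresses notes for partial or missing dates. */
    _aedt = input(substr(aestdtc, 1, 10), ?? yymmdd10.);
    _rfdt = input(substr(rfstdtc, 1, 10), ?? yymmdd10.);
    if not missing(_aedt) and not missing(_rfdt) then
        aestdy = _aedt - _rfdt + (_aedt >= _rfdt);
    drop _aedt _rfdt;
run;
```

The OTHER row matters: without it, PUT returns the unformatted USUBJID for subjects missing from DM, which could silently be mistaken for a date string.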
We feel that this approach is particularly relevant now that the widespread adoption of the ADaM model means that companies are having to deal with ever larger analysis datasets. A nice feature of the paper is its proposal for merging with more than one variable.
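One common way to handle a lookup on more than one variable is to concatenate the keys into a single format start value. The sketch below takes that route; it may not match the paper's exact proposal. The format name SVSTFMT, the choice of SDTM.SV and SDTM.VS, and the '|' delimiter are our assumptions, and the code assumes SDTM.SV is unique on USUBJID and VISITNUM.

```sas
/* Control dataset keyed on USUBJID and VISITNUM together.            */
data fmt_sv;
    set sdtm.sv (keep=usubjid visitnum svstdtc) end=last;
    retain fmtname 'SVSTFMT' type 'C';   /* hypothetical format name   */
    length start $60;
    /* CATX trims each key and joins them with a delimiter that is    */
    /* assumed never to appear in USUBJID.                            */
    start = catx('|', usubjid, visitnum);
    label = svstdtc;
    output;
    if last then do;                     /* unmatched keys map to blank */
        hlo   = 'O';
        start = ' ';
        label = ' ';
        output;
    end;
    keep fmtname type start label hlo;
run;

proc format cntlin=fmt_sv;
run;

/* Look up the visit start date for each vital signs record.          */
data vs_sv;
    set sdtm.vs;
    length svstdtc $19;
    /* The key must be built exactly as it was in the control dataset. */
    svstdtc = put(catx('|', usubjid, visitnum), $svstfmt.);
run;
```

Building the key with the same CATX call on both sides keeps the numeric-to-character conversion of VISITNUM consistent between the format and the lookup.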
The paper was presented in the Coders’ Corner section of the SAS Global Forum 2012 Conference in Orlando. Download the paper from Lex Jansen.