Imperva Cyber Community

Data Classification for VSAM

  • 1.  Data Classification for VSAM

    Posted 04-20-2021 10:27
    Has anyone done data classification of VSAM files on the mainframe, or does anyone know whether Imperva supports it?

    Database Security Engineer
    Birmingham AL

  • 2.  RE: Data Classification for VSAM

    Imperva Employee
    Posted 04-21-2021 12:52

    From a data classification standpoint, VSAM files are no different from any other non-DBMS files. They therefore cannot be classified using the common relational database classification techniques, which consist of scanning the relational DBMS catalogs and then using the catalog information to scan the data itself.

    As with most non-DBMS files, the metadata for the file is contained within the application, not the file system itself. In the case of z/OS applications using VSAM, field names and data types are usually kept in "copybooks". Copybooks are application descriptors of a file's fields and data types. They are created for inclusion in z/OS application programs, usually written in COBOL.
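
    To make the copybook idea concrete, here is a minimal Python sketch that pulls field names and PICTURE clauses out of a copybook. The sample record layout and the parse_copybook helper are hypothetical, made up for illustration. It assumes one statement per line and no sequence numbers in columns 1-6, and it ignores REDEFINES, OCCURS, and continuation lines, so treat it as a starting point and not a real copybook parser.

```python
# Hypothetical sample copybook; the record and field names are invented.
SAMPLE_COPYBOOK = """\
       01  CUSTOMER-RECORD.
           05  CUST-ID        PIC 9(8).
           05  CUST-NAME      PIC X(30).
           05  CUST-SSN       PIC 9(9).
           05  CUST-BALANCE   PIC S9(7)V99 COMP-3.
"""


def parse_copybook(text):
    """Return a list of {level, name, picture} dicts, one per data item.

    Assumes one statement per line; group items get picture=None.
    """
    fields = []
    for line in text.splitlines():
        # Drop surrounding whitespace and the terminating period.
        tokens = line.strip().rstrip(".").split()
        # A data description entry starts with a numeric level number.
        if len(tokens) < 2 or not tokens[0].isdigit():
            continue
        upper = [t.upper() for t in tokens]
        picture = None
        for keyword in ("PIC", "PICTURE"):
            if keyword in upper:
                idx = upper.index(keyword) + 1
                # "PIC IS X(10)" is legal COBOL; skip the optional IS.
                if idx < len(tokens) and upper[idx] == "IS":
                    idx += 1
                if idx < len(tokens):
                    picture = tokens[idx]
                break
        fields.append(
            {"level": int(tokens[0]), "name": tokens[1], "picture": picture}
        )
    return fields


if __name__ == "__main__":
    for field in parse_copybook(SAMPLE_COPYBOOK):
        print(field["level"], field["name"], field["picture"])
```

    Field names such as CUST-SSN in the output are what a classification pass would then inspect for hints of sensitive data.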

    ETL (Extract, Transform, and Load) consultants have been using copybooks for years as the basis for a largely manual process of exporting VSAM files to relational data warehouses that are more easily accessible to decision support systems. The key word there is "manual". Since there can be many programs, and therefore many copybooks, that access one file, someone has to identify by hand which copybook is most representative of the file.

    If you search for "VSAM to DB2 conversions", you will get a pretty good idea of what is involved in obtaining the VSAM file metadata.

    I have also heard of rudimentary classification being done using only the file names. It is common for z/OS enterprises to name their VSAM files with the application and type of data as part of the name, for example billing.customers or payroll.employees. This manual yet simple technique might yield enough information to at least indicate where sensitive data resides.
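
    The name-based approach could be sketched roughly as follows in Python. The keyword table and the classify_dataset_name function are assumptions for illustration, not anything Imperva ships. Real z/OS qualifiers are limited to eight characters, so a site's own abbreviations would need to go into the table.

```python
# Hypothetical keyword-to-label table; extend it with site-specific
# abbreviations found in your own data set naming conventions.
SENSITIVE_KEYWORDS = {
    "CUSTOMER": "customer data",
    "EMPLOYEE": "employee data",
    "PAYROLL": "payroll data",
    "BILLING": "billing data",
}


def classify_dataset_name(name):
    """Return sorted labels whose keyword appears in any name qualifier."""
    labels = set()
    # Data set names are dot-separated qualifiers; compare in upper case.
    for qualifier in name.upper().split("."):
        for keyword, label in SENSITIVE_KEYWORDS.items():
            if keyword in qualifier:
                labels.add(label)
    return sorted(labels)


if __name__ == "__main__":
    for dsn in ("billing.customers", "payroll.employees", "SYS1.LINKLIB"):
        print(dsn, classify_dataset_name(dsn))
```

    A name that matches nothing returns an empty list, which only means the name gave no hint, not that the file is free of sensitive data.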

    Michael Figaro
    System Engineer
    Houston TX