The “dir” Stream

Introduction

This page is really just a continuation of the page on Stream (de‑)compression [link to StreamCompression.php]. It is separated out for my convenience as much as anything, and I won’t waste your time with any more introduction than that!

The Contents

VBA Projects seem to be stored in much the same way I store things in my attic. Some parts of it are tidy and organised, some parts of it are superficially tidy but, on close inspection, are actually rather chaotic, and everything else is awaiting organisation. The dir Stream appears to be one of the parts that looks well organised until one digs down into it, but dig down into it you must.

The dir Stream consists of a series of what Microsoft call Records that are arranged, loosely, in a sort of hierarchy. The top level records are called “PROJECTINFORMATION”, “PROJECTREFERENCES”, and “PROJECTMODULES”, and these are followed by a two byte “Terminator” of 0x0010, and a “Reserved” field that both MUST be 0x00000000 and MUST be ignored.

Project Information

The Stream begins with Project Information. The “PROJECTINFORMATION” Record consists of a series of sub‑records that must be in sequence. Although the documentation suggests a rather more complex layout, the records, with a single exception, are all in the same format: a two byte ID, followed by a four byte size, followed by data of that size. I must stress that this is merely my interpretation of the documentation taken along with observation of many projects. The sub‑records are detailed below; the upper case names are all taken from Microsoft’s documentation, except that I have removed “PROJECT” from the front of each of them in order to make them slightly more readable.

The series of Project Information Records is not specially delimited, and not of a fixed length, so the only way to identify its end is by identifying the beginning of the next series of Records, the Project References.
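As a minimal sketch of that common format (a two byte ID, a four byte size, then the data), the following walks a Byte array record by record. The names SketchReadId, SketchReadSize and SketchListRecords are invented for this illustration and are not the routines used later on this page; the sample bytes are hand-built rather than taken from a real project, and the array is assumed to start at a record boundary.

```vba
Option Explicit

' Hypothetical helper (not the author's): little-endian 2-byte value at a
' zero-based offset, widened to a Long so that it is never negative.
Private Function SketchReadId(ByRef Data() As Byte, ByVal Offset As Long) As Long
    SketchReadId = Data(Offset) + Data(Offset + 1) * 256&
End Function

' Hypothetical helper: little-endian 4-byte value. Assumes the top bit is
' clear, which holds for any plausible record size.
Private Function SketchReadSize(ByRef Data() As Byte, ByVal Offset As Long) As Long
    SketchReadSize = Data(Offset) + Data(Offset + 1) * 256& _
                   + Data(Offset + 2) * 65536 + Data(Offset + 3) * 16777216
End Function

' Walk the records from the start of Data, returning "ID:size" pairs.
Public Function SketchListRecords(ByRef Data() As Byte) As String
    Dim Pos As Long, Id As Long, Size As Long
    Dim Listing As String

    Pos = LBound(Data)
    Do While Pos + 6 <= UBound(Data) + 1
        Id = SketchReadId(Data, Pos)
        Size = SketchReadSize(Data, Pos + 2)
        ' The single exception: the version record (ID 0x0009) is followed
        ' by 6 bytes of data (a Long and an Integer) whatever its size says.
        If Id = &H9& Then Size = 6
        Listing = Listing & Right$("000" & Hex$(Id), 4) & ":" & Size & " "
        Pos = Pos + 6 + Size
    Loop
    SketchListRecords = Trim$(Listing)
End Function

Public Sub SketchListRecordsDemo()
    Dim Raw As Variant, Data() As Byte, i As Long
    ' SYSKIND (ID 1, size 4, value 1) then LCID (ID 2, size 4, value 0x0409).
    Raw = Array(1, 0, 4, 0, 0, 0, 1, 0, 0, 0, _
                2, 0, 4, 0, 0, 0, 9, 4, 0, 0)
    ReDim Data(0 To UBound(Raw) - LBound(Raw))
    For i = 0 To UBound(Data)
        Data(i) = Raw(LBound(Raw) + i)
    Next i
    Debug.Print SketchListRecords(Data)     ' 0001:4 0002:4
End Sub
```

Returning the IDs as Longs rather than Integers is a choice made only to keep the sketch simple; it avoids any worry about sign when comparing against hexadecimal constants.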

Project References

It was a request to be able to access this information outside Word that set me on the path to writing all this. What the documentation describes as an array of Reference Records can be treated in almost the same way as the Project Information, above. Although not quite as simple as before, the data can still be viewed as a series of sub‑records, but there is an extra level to what I described as the hierarchy.

Each Project Reference begins with two “NAME” records. As with most textual data, and as you saw above, the first record holds the name in Codepage‑encoded MBCS characters and can be ignored, while the second holds the same name in UTF‑16 encoded Unicode characters. In this case the records have IDs of 0x0016 and 0x003E respectively.
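A minimal sketch of picking up the name from such a pair, assuming a zero-based Byte array and an offset that already points at the first NAME record. SketchReadSize and SketchReferenceName are names invented for this illustration, not the routines used later on this page, and "stdole" is merely a sample name.

```vba
Option Explicit

' Hypothetical helper (not the author's): little-endian 4-byte value at a
' zero-based offset. Assumes the top bit is clear, as any real size will be.
Private Function SketchReadSize(ByRef Data() As Byte, ByVal Offset As Long) As Long
    SketchReadSize = Data(Offset) + Data(Offset + 1) * 256& _
                   + Data(Offset + 2) * 65536 + Data(Offset + 3) * 16777216
End Function

' Returns the UTF-16 name from the pair of NAME records that starts at Pos.
' Assumes Data is zero-based.
Public Function SketchReferenceName(ByRef Data() As Byte, ByVal Pos As Long) As String
    Dim Size As Long
    Dim Raw As String

    ' First NAME record (ID 0x0016): MBCS text, stepped over unread.
    If Data(Pos) <> &H16 Or Data(Pos + 1) <> 0 Then Err.Raise 5, , "Expected 0x0016"
    Pos = Pos + 6 + SketchReadSize(Data, Pos + 2)

    ' Second NAME record (ID 0x003E): the same name in UTF-16.
    If Data(Pos) <> &H3E Or Data(Pos + 1) <> 0 Then Err.Raise 5, , "Expected 0x003E"
    Size = SketchReadSize(Data, Pos + 2)

    ' Assigning a Byte array to a String copies the bytes unconverted and,
    ' as VBA Strings are UTF-16 internally, MidB$ (1-based, counting bytes)
    ' can lift the text straight out.
    Raw = Data
    SketchReferenceName = MidB$(Raw, Pos + 6 + 1, Size)
End Function

Public Sub SketchReferenceNameDemo()
    Dim Data() As Byte
    ' Built as a byte string: ID, size, and text for each of the two records.
    Data = ChrB$(&H16) & ChrB$(0) & ChrB$(6) & ChrB$(0) & ChrB$(0) & ChrB$(0) _
         & StrConv("stdole", vbFromUnicode) _
         & ChrB$(&H3E) & ChrB$(0) & ChrB$(12) & ChrB$(0) & ChrB$(0) & ChrB$(0) _
         & "stdole"
    Debug.Print SketchReferenceName(Data, 0)    ' stdole
End Sub
```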

The Name records are followed by a series of records, which differ according to the type of Reference; the only indication of the type of Reference is the presence of particular record types. The documentation refers to two different structures, both called LibIds, and names several different types of these LibIds. To try to summarise this in a sentence or two would, I fear, confuse, so I will simply describe each type of reference in turn.

Registered References

A Registered reference reflects an Automation Type Library, as ‘normal’ a reference as there is. It is worth stating that two basic references, those to the VBA library, and the parent application library (in this case, the Word Library) are not recorded as forming part of the VBA Project, as they are deemed part of the infrastructure and assumed always to be available.

“REGISTERED” records have an ID of 0x000D, and contain a 4‑byte Libid size (as well as the 4‑byte record size that precedes it), a LibId, and two “Reserved” fields, one of 4 bytes, and one of 2 bytes, both of which “MUST” contain binary zeroes.

The LibId is a string of textual data, containing MBCS characters, encoded using whatever CodePage was in effect when it was created, which CodePage, of course, you do not know. The text begins with “*\” (asterisk, backslash), and this is followed by a series of individual elements separated by hash (“#”) characters. The elements are:
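By way of illustration, here is a minimal sketch of pulling such a string apart. It assumes the layout as I read it in Microsoft's [MS-OVBA] documentation: a single kind letter immediately followed by a GUID, then a major.minor version, an LCID, a path, and a registered name. SketchSplitLibId is a name invented for this example, the sample string is only illustrative, and a path that itself contained a hash would defeat so simple a split.

```vba
Option Explicit

' Hypothetical sketch: split a Registered-reference LibId into its parts.
' Returns a zero-based array: kind, GUID, major, minor, LCID, path, name.
Public Function SketchSplitLibId(ByVal LibIdText As String) As String()
    Dim Parts() As String
    Dim Version() As String
    Dim Result() As String

    If Left$(LibIdText, 2) <> "*\" Then Err.Raise 5, , "Not a LibId"

    ' Everything after the leading asterisk and backslash is hash-separated.
    Parts = Split(Mid$(LibIdText, 3), "#")
    If UBound(Parts) < 4 Then Err.Raise 5, , "Too few elements"

    ReDim Result(0 To 6)
    Result(0) = Left$(Parts(0), 1)          ' kind letter, e.g. G
    Result(1) = Mid$(Parts(0), 2)           ' GUID, braces included
    Version = Split(Parts(1), ".")
    If UBound(Version) >= 0 Then Result(2) = Version(0)
    If UBound(Version) >= 1 Then Result(3) = Version(1)
    Result(4) = Parts(2)                    ' LCID
    Result(5) = Parts(3)                    ' path
    Result(6) = Parts(4)                    ' registered name

    SketchSplitLibId = Result
End Function

Public Sub SketchSplitLibIdDemo()
    Dim Parts() As String
    Parts = SketchSplitLibId("*\G{00020430-0000-0000-C000-000000000046}" & _
                             "#2.0#0#C:\Windows\System32\stdole2.tlb#OLE Automation")
    Debug.Print Parts(0), Parts(2) & "." & Parts(3), Parts(6)
End Sub
```

All the parts are kept as Strings, as in the LibId Type declared in the code on this page, so no assumption is made about whether the version and LCID are written in decimal or hexadecimal.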

Project References

A Project reference is, I suppose, exactly what it says it is: a reference to a project. In the case of a Word Document (but not, of course, a Word Template), its VBA Project, if any, always has a reference to its Template, so there will be, at least, one Project Reference; depending on the Project, there may be more.

“PROJECT” records have an ID of 0x000E, and contain two LibIds, called an Absolute LibId and a Relative LibId, each preceded by a 4‑byte Libid size; the LibIds are followed by a “Major Version” and a “Minor Version”, respectively 4‑byte and 2‑byte binary numbers.

The LibId in a Project Reference is not the same as the LibId in a Registered Reference. It is still a string of textual data containing MBCS characters, encoded using the CodePage that was in effect when it was created, and it still contains a series of individual elements. Here, however, following the same leading “*\”, there are only two elements, the first of which is of a fixed length, so there is no need for any separator character. The elements are:
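A minimal sketch of reading this simpler form, on the assumption that the fixed-length first element is a single kind character (the values 0x41 to 0x44 that appear in the LibIdKind enumeration in the code on this page) and that the second element is simply the remainder of the string. SketchDescribeProjectLibId is a name invented for this example, and the sample string is only illustrative.

```vba
Option Explicit

' Hypothetical sketch: describe a Project-reference LibId as "kind|path".
Public Function SketchDescribeProjectLibId(ByVal LibIdText As String) As String
    Dim KindName As String

    If Len(LibIdText) < 3 Or Left$(LibIdText, 2) <> "*\" Then
        Err.Raise 5, , "Not a Project LibId"
    End If

    ' Element one is a single character, so no separator is needed.
    Select Case Asc(Mid$(LibIdText, 3, 1))
        Case &H41: KindName = "StandaloneWindowsFilePath"
        Case &H42: KindName = "StandaloneMacFilePath"
        Case &H43: KindName = "EmbeddedWindowsFilePath"
        Case &H44: KindName = "EmbeddedMacFilePath"
        Case Else: KindName = "Unknown"
    End Select

    ' Element two is everything that remains: the path.
    SketchDescribeProjectLibId = KindName & "|" & Mid$(LibIdText, 4)
End Function

Public Sub SketchDescribeProjectLibIdDemo()
    Debug.Print SketchDescribeProjectLibId("*\CNormal")
End Sub
```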

Control References

A Control reference, according to Microsoft, specifies a reference to a twiddled type library (I kid you not) and its extended type library. A Twiddled Type Library is a modified Automation Type Library in which all controls are marked as extensible, and is generated by the VBE when controls are added. In practice, I think this means when a Project has a UserForm.

“CONTROL” records are, effectively, a series of records documented as one, with a couple of minor fields tacked on the end. Following the two “NAME” records (MBCS and Unicode), they consist of:

The Project Reference Records, just like the Project Information Records before them, are neither fixed size nor delimited, and their end can only be deduced by the presence of the first of the Project Module Records that follow.

Project Modules

The final series of records is one of “PROJECTMODULES”. There are two small records followed by a series of sub‑records for each module in the project. The first record, the “MODULES” record itself, if you like, has an ID of 0x000F, a size of 2, and contains a 2‑byte count of the modules in the project, that is, of the number of series of individual Module Records that follow. This is followed by a “COOKIE” Record, which has an ID of 0x0013, a size of 2, and a 2‑byte binary value that “MUST” be 0xFFFF (and usually isn’t), and “MUST be ignored”. The individual sub‑records for each module are detailed, in sequence, below, and, as before, the upper case names are all taken from Microsoft’s documentation, except that I have removed “MODULE” from the front of each of them to make them more readable.
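The two small leading records can be sketched as follows, assuming a zero-based Byte array and an offset pointing at the “MODULES” record. SketchReadWord and SketchModuleCount are names invented for this illustration, not the routines used later on this page, and the sample bytes are hand-built.

```vba
Option Explicit

' Hypothetical helper (not the author's): little-endian 2-byte value at a
' zero-based offset, widened to a Long so that it is never negative.
Private Function SketchReadWord(ByRef Data() As Byte, ByVal Offset As Long) As Long
    SketchReadWord = Data(Offset) + Data(Offset + 1) * 256&
End Function

' Returns the module count and moves Pos past both leading records.
Public Function SketchModuleCount(ByRef Data() As Byte, ByRef Pos As Long) As Long
    If SketchReadWord(Data, Pos) <> &HF& Then
        Err.Raise 5, , "Expected MODULES (0x000F)"
    End If
    ' ID (2 bytes) and size (4 bytes) precede the 2-byte count.
    SketchModuleCount = SketchReadWord(Data, Pos + 6)
    Pos = Pos + 8

    If SketchReadWord(Data, Pos) <> &H13& Then
        Err.Raise 5, , "Expected COOKIE (0x0013)"
    End If
    ' The cookie value is deliberately not checked: it "MUST be ignored".
    Pos = Pos + 8
End Function

Public Sub SketchModuleCountDemo()
    Dim Raw As Variant, Data() As Byte, i As Long, Pos As Long
    ' MODULES: ID 0x000F, size 2, count 3; COOKIE: ID 0x0013, size 2, value.
    Raw = Array(&HF, 0, 2, 0, 0, 0, 3, 0, _
                &H13, 0, 2, 0, 0, 0, &H8A, &H9C)
    ReDim Data(0 To UBound(Raw) - LBound(Raw))
    For i = 0 To UBound(Data)
        Data(i) = Raw(LBound(Raw) + i)
    Next i
    Pos = 0
    Debug.Print SketchModuleCount(Data, Pos), Pos   ' 3 and 16
End Sub
```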

Some VBA

For myself, I have this set up as a Class Module, and it uses several Windows API Calls. I may share this with you later, but, as I have said before, I do not like providing example code of this sort: readers who are comfortable with this kind of thing can, surely, code their own, whilst those who are not can all too easily make serious mistakes.

What follows here is some relatively straightforward code that works sequentially through the Stream gathering details to enable whatever further action you may wish to take. This code builds on the decompression code shown on the previous page.

First of all, quite a lot of declarations are made at module level. Comments in the code should explain all of them. This has to go somewhere in the module before the first procedure: I have it after the Directory Type definitions and before the module level declarations.

' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' Declarations for dir Stream processing. Firstly a simple flag to control processing.  '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

Private Enum RecordSeries
    RecordSeries_PROJECTINFORMATION
    RecordSeries_PROJECTREFERENCES
    RecordSeries_PROJECTMODULES
    RecordSeries_TERMINATOR
End Enum
Private ProjectRecordSeries As RecordSeries
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' All the record types are listed here. Most of the names are as given in the Microsoft '
' documentation; where these don't exist, I have tried to choose appropriate ones. For  '
' clarity, I have given the values as comments, which can be aligned without the VBE    '
' removing all the multiple spaces, and values are given in code only where necessary,  '
' because not all possible values are used. The actual (sub-)record types are 2-byte,   '
' Integer values, and the 4-byte Enumeration values are used as comparands only.        '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

Private Enum RecordType
    
    PROJECTSYSKIND = &H1&                       ' 0001
    PROJECTLCID                                 ' 0002
    PROJECTCODEPAGE                             ' 0003
    PROJECTNAME                                 ' 0004
    PROJECTDOCSTRING                            ' 0005
    PROJECTHELPFILEPATH                         ' 0006
    PROJECTHELPCONTEXT                          ' 0007
    
    PROJECTLIBFLAGS                             ' 0008
    PROJECTVERSION                              ' 0009
    PROJECTCONSTANTS = &HC&                     ' 000C
    REFERENCEREGISTERED                         ' 000D
    REFERENCEPROJECT                            ' 000E
    MODULESCOUNT                                ' 000F
    
    TERMINATOR                                  ' 0010
    PROJECTCOOKIE = &H13&                       ' 0013
    PROJECTLCIDINVOKE                           ' 0014
    REFERENCENAME = &H16&                       ' 0016
    
    MODULENAME = &H19&                          ' 0019
    MODULESTREAMNAME                            ' 001A
    MODULEDOCSTRING = &H1C&                     ' 001C
    MODULEHELPCONTEXT = &H1E&                   ' 001E
    
    MODULETYPEMODULERECORD = &H21&              ' 0021
    MODULETYPECLASSMODULERECORD                 ' 0022
    MODULEREADONLY = &H25&                      ' 0025

    MODULEPRIVATE = &H28&                       ' 0028
    MODULETERMINATOR = &H2B&                    ' 002B
    MODULECOOKIE                                ' 002C
    REFERENCECONTROL = &H2F&                    ' 002F
    
    REFERENCEEXTENDED                           ' 0030
    MODULEOFFSET                                ' 0031
    MODULESTREAMNAMEUNICODE                     ' 0032
    REFERENCEORIGINAL                           ' 0033
    
    PROJECTCONSTANTSUNICODE = &H3C&             ' 003C
    PROJECTHELPFILEPATH2                        ' 003D
    REFERENCENAMEUNICODE                        ' 003E
    
    PROJECTDOCSTRINGUNICODE = &H40&             ' 0040
    MODULENAMEUNICODE = &H47&                   ' 0047
    
    MODULEDOCSTRINGUNICODE                      ' 0048
    
End Enum

' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' These enumerations are not actually needed; they show possible values of some records '
' and, I hope, add a little clarity.                                                    '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

Private Enum SysKind
    SysKind_16bitWindows
    SysKind_32bitWindows
    SysKind_Macintosh
    SysKind_64bitWindows
End Enum
Private Enum Lcid
    Lcid_US_English = &H409&
End Enum
Private Enum Codepage
    Windows_1252 = 1252
End Enum

Private Enum LibIdKind
    StandaloneWindowsFilePath = &H41&           ' 41
    StandaloneMacFilePath                       ' 42
    EmbeddedWindowsFilePath                     ' 43
    EmbeddedMacFilePath                         ' 44
    WindowsFilePath = &H47&                     ' 47
    MacFilePath                                 ' 48
End Enum

Private Enum LibIdType
    LibidTypeOrdinary
    LibidTypeTwiddled
    LibidTypeExtended
    LibidTypeProjectAbsolute
    LibidTypeProjectRelative
End Enum

Private Enum ModuleType
    ModuleTypeModule
    ModuleTypeClassModule
End Enum

' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' Typedefs for References.                                                              '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

Private Type GUID
    Data1 As Long
    Data2 As Integer
    Data3 As Integer
    Data4(7) As Byte
End Type

Private Type LibId
    Type                        As LibIdType
    Kind                        As LibIdKind
    GUIDString                  As String
    MajorVersion                As String
    MinorVersion                As String
    Lcid                        As String
    Path                        As String
    RegName                     As String
End Type
Private Type ProjectLibId
    Type                        As LibIdType
    Kind                        As LibIdKind
    Path                        As String
End Type

Private Type RegisteredReference
    LibId                       As LibId
End Type
Private Type ProjectReference
    ProjectLibId(0 To 1)        As ProjectLibId
    MajorVersion                As Long
    MinorVersion                As Integer
End Type
Private Type ControlReference
    LibId(0 To 2)               As LibId
    ExtendedName                As String
    OriginalTypeLib             As GUID
    OriginalTypeLibString       As String
    Cookie                      As Long
End Type

' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' Structures for the project information, references, and modules data.                 '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

Private Type ProjectInformation
    SysKind                     As SysKind
    Lcid                        As Lcid
    LcidInvoke                  As Lcid
    Codepage                    As Integer
    Name                        As String
    Description                 As String
    HelpFilePath                As String
    HelpContext                 As Long
    LibFlags                    As Long
    VersionMajor                As Long
    VersionMinor                As Integer
    Constants                   As String
End Type

Private Type ProjectReferences
    Type                        As RecordType
    Name                        As String
    Registered                  As RegisteredReference
    Project                     As ProjectReference
    Control                     As ControlReference
End Type

Private Type ProjectModules
    Type                        As ModuleType
    Name                        As String
    ReadOnly                    As Boolean
    Private                     As Boolean
    StreamName                  As String
    Description                 As String
    Offset                      As Long
    HelpContext                 As Long
End Type


Private Project                 As ProjectInformation
Private Reference()             As ProjectReferences
Private Module()                As ProjectModules

' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' End of dir Stream-specific declarations.                                              '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

The new code to extract the contents of the “dir” stream involves several routines and, altogether, quite a lot of code, so I will present it bit by bit. A single line added at the end of the main driving routine, immediately after the call that decompresses the stream, is all that is needed to invoke a routine that controls this process.

    Call DecompressContainer(Stream, Compndx, DeCompressedData) ' Existing code
    Call ExtractdirData(DeCompressedData)                       ' New Call

.. and the new routine, which, itself, just calls other routines, looks like this:

Private Sub ExtractdirData(Stream() As Byte)

    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    ' The dir Stream consists of a series of what Microsoft call Records that are '
    ' arranged, loosely, in a sort of hierarchy. The top level ones are called    '
    ' PROJECTINFORMATION, PROJECTREFERENCES, and PROJECTMODULES, and these are    '
    ' followed by a "Terminator" record with an ID of 0x0010, and a size of zero. '
    ' Several 'records' contain fields that "MUST be ignored"; whether they all   '
    ' really should be ignored is a moot point, but if they should, one can only  '
    ' wonder why they are present.                                                '
    ' Some, though not all, strings are held in both MBCS format and Unicode (and '
    ' one, rather oddly, twice in MBCS format). Although there may be some merit  '
    ' in doing this, in practice the MBCS strings do not appear to be correctly   '
    ' held, so no attempt is made to map them to the right code page. As the VB   '
    ' Editor has no capacity for input of characters that might require a multi-  '
    ' byte encoding, I rather suspect that Microsoft are being, shall we say,     '
    ' disingenuous with the MBCS suggestion, and I believe them to be, and treat  '
    ' them as, simple, single-byte, ANSI characters - until proved otherwise :-)  '
    ' Where possible, Unicode versions are extracted and MBCS ones ignored.       '
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    
    Dim dirndx                  As Long

    ProjectRecordSeries = RecordSeries_PROJECTINFORMATION
    dirndx = 0

    Do 
    
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        ' This driver just calls other routines to do its work!                   '
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

        Select Case ProjectRecordSeries
            Case RecordSeries_PROJECTINFORMATION
                ExtractInformation Stream, dirndx
            Case RecordSeries_PROJECTREFERENCES
                ExtractReferences Stream, dirndx
            Case RecordSeries_PROJECTMODULES
                ExtractModules Stream, dirndx
            Case RecordSeries_TERMINATOR
                Exit Do
            Case Else
                Debug.Print "Unknown dir Stream Record"     ' Should never happen
                Exit Do
        End Select
        
    Loop

End Sub

The only reason I have for writing separate routines is to avoid having one very long routine. The first routine works through the Project Information data, which is in a fairly fixed format at the start of the stream. You will note that there are some calls to Functions you don't recognise, although I hope you can guess what they do. All will, as they say, be revealed if you read on.

Private Sub ExtractInformation(ByRef Stream() As Byte, _
                               ByRef ndx As Long)
    
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    ' The PROJECTINFORMATION Record consists of a series of sub-records that are  '
    ' in sequence and are all always present. Although the documentation suggests '
    ' a rather more complex layout, the sub-records, with a single exception, are '
    ' all in the same format: a 2-byte ID, followed by a four byte size, followed '
    ' by data of that size; this is merely my interpretation, however, and things '
    ' may change in future releases.                                              '
    ' Although there is, as there always seems to be, a special case, one record  '
    ' where the "Size" is not the size (the Project Version Record - type 9), the '
    ' size of the 2-byte and 4-byte numeric values (Integers and Longs) is known, '
    ' and the size of String values is easily determinable, using Len or LenB  as '
    ' appropriate, so what is held in the stream can be ignored.                  '
    ' The absolute increments in the following code are: 2 for the length of the  '
    ' Record Id, and 6 for the combined length of Record Id and size.             '
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    
    Dim Unicode                 As Boolean
    
    With Project
    
        .SysKind = GetLong(Stream, ndx + 6)
        ndx = ndx + 6 + Len(.SysKind)

        .Lcid = GetLong(Stream, ndx + 6)       ' Always 0x0409 = US English
        ndx = ndx + 6 + Len(.Lcid)

        .LcidInvoke = GetLong(Stream, ndx + 6) ' Always 0x0409
        ndx = ndx + 6 + Len(.LcidInvoke)

        .Codepage = GetInt(Stream, ndx + 6)    ' Always seems to be 1252
        ndx = ndx + 6 + Len(.Codepage)

        Unicode = False
        .Name = GetString(Stream, ndx + 6, GetLong(Stream, ndx + 2), Unicode)
        ndx = ndx + 6 + IIf(Unicode, LenB(.Name), Len(.Name))
        
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        ' Ignore the MBCS version of the PROJECTDOCSTRING (Microsoft's name for   '
        ' the Description), and just take the Unicode version.                    '
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        ndx = ndx + 6 + GetLong(Stream, ndx + 2)
        Unicode = True
        .Description = GetString(Stream, ndx + 6, GetLong(Stream, ndx + 2), Unicode)
        ndx = ndx + 6 + IIf(Unicode, LenB(.Description), Len(.Description))

        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        ' After storing the PROJECTHELPFILEPATH value, ignore the duplicate value '
        ' on the PROJECTHELPFILEPATH2 sub-record.                                 '
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        Unicode = False
        .HelpFilePath = GetString(Stream, ndx + 6, GetLong(Stream, ndx + 2), Unicode)
        ndx = ndx + 6 + IIf(Unicode, LenB(.HelpFilePath), Len(.HelpFilePath))
        ndx = ndx + 6 + GetLong(Stream, ndx + 2)

        .HelpContext = GetLong(Stream, ndx + 6)
        ndx = ndx + 6 + Len(.HelpContext)

        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        ' The documentation for PROJECTLIBFLAGS, on the one hand, refers to the   '
        ' [MS-OAUT] documentation for Automation while, on the other hand, saying '
        ' that this must be 0x00000000. Who knows? Just extract it.               '
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        .LibFlags = GetLong(Stream, ndx + 6)
        ndx = ndx + 6 + Len(.LibFlags)

        .VersionMajor = GetLong(Stream, ndx + 6)
        .VersionMinor = GetInt(Stream, ndx + 6 + Len(.VersionMajor))
        ndx = ndx + 6 + Len(.VersionMajor) + Len(.VersionMinor)

        Unicode = True
        ndx = ndx + 6 + GetLong(Stream, ndx + 2)
        .Constants = GetString(Stream, ndx + 6, GetLong(Stream, ndx + 2), Unicode)
        ndx = ndx + 6 + IIf(Unicode, LenB(.Constants), Len(.Constants))

    End With

    ProjectRecordSeries = ProjectRecordSeries + 1
    
End Sub

This is followed by two similar routines: one for the Reference details, which contains some calls to further sub‑routines you have yet to see:

Private Sub ExtractReferences(ByRef Stream() As Byte, _
                              ByRef ndx As Long)

    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    ' The PROJECTREFERENCES Record contains sub-records for an indeterminate      '
    ' number of references. The basic logic is to extract the Record Id and the   '
    ' size, and then to process 'size' bytes appropriately according to the Id.   '
    ' Within each reference the sub-records, varying according to the type of the '
    ' reference, are in a fixed sequence and there are some dependencies on that  '
    ' sequence. In particular the NAME sub-record is always the first one, but    '
    ' NAME sub-records may also be embedded in extended references (see "Control" '
    ' records below), and a flag is maintained to indicate which may be the case. '
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    Dim RecordId                As Integer
    Dim size                    As Long
    Dim LibIdSize               As Long

    Dim IsControlReference      As Boolean

    Do
    
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        ' Extract the Record Id and Size, and increment the index into the chunk. '
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        
        RecordId = GetInt(Stream, ndx)
        size = GetLong(Stream, ndx + 2)
        ndx = ndx + 6
        
        Select Case RecordId

            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' If not currently mid-process amongst the various sub-records that   '
            ' make up a Control Reference, a NAME sub-record signifies the start  '
            ' of a new reference, so the array is extended. Whichever case, the   '
            ' contents are in MBCS format and are ignored, preference being given '
            ' to the Unicode version that follows.                                '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case REFERENCENAME
                If Not IsControlReference Then
                    If (Not Reference) = True Then
                        ReDim Reference(0 To 0)
                    Else
                        ReDim Preserve Reference(LBound(Reference) To _
                                                 UBound(Reference) + 1)
                    End If
                End If
            Case REFERENCENAMEUNICODE
                If Not IsControlReference Then
                    Reference(UBound(Reference)).Name _
                            = GetString(Stream, ndx, size, Unicode:=True)
                Else
                    Reference(UBound(Reference)).Control.ExtendedName _
                            = GetString(Stream, ndx, size, Unicode:=True)
                End If

            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' Name records are followed by one of three further records: Control, '
            ' Registered, or Project, all of which hold detail in structures      '
            ' called "LibId"s. I don't really understand the need for all the     '
            ' complication, but a Registered reference simply contains a LibId, a '
            ' Project reference holds an "absolute" and a "relative" LibId, and a '
            ' Control reference has a structure containing an "Original" record,  '
            ' a "Twiddled" record, and an "Extended" record, each containing,     '
            ' amongst other fields, a LibId.                                      '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The Reference Registered record contains a 4-byte Libid length, the '
            ' LibId itself, and two fields, called "Reserved1" and "Reserved2",   '
            ' which are, respectively, four and two bytes of binary zeroes.       '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case REFERENCEREGISTERED
                With Reference(UBound(Reference))
                    .Type = REFERENCEREGISTERED
                    StoreLibid .Registered.LibId, _
                               GetString(Stream, ndx + 4, GetLong(Stream, ndx), _
                                         Unicode:=False), _
                               LibidTypeOrdinary
                End With

            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The Reference Project record contains a 4-byte length of its        '
            ' "absolute" LibId and then the LibId itself, a 4-byte length of its  '
            ' "relative" Libid, and then that Libid itself, followed by a major   '
            ' version and a minor version. Although the documentation calls these '
            ' "LibIds", they are not LibIds like the others; they are Project     '
            ' References in a somewhat different format. Obfuscation seems to be  '
            ' prevalent in the documentation as much as in the data.              '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case REFERENCEPROJECT
                With Reference(UBound(Reference))
                    .Type = REFERENCEPROJECT
                    StoreProjectReference .Project, Stream, ndx
                End With
            
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The Reference Control record starts with an "Original" sub-record,  '
            ' which should contain a 4-byte length of the original Libid, but     '
            ' which may not exist. This is followed by a "Twiddled" sub-record    '
            ' (please don't ask what twiddling is) which is just like a           '
            ' Registered record, and this is then followed by an "Extended Name"  '
            ' record (as per the "Name" record above).                            '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case REFERENCEORIGINAL
                With Reference(UBound(Reference))
                    .Type = REFERENCECONTROL
                    StoreLibid .Control.LibId(0), _
                               GetString(Stream, ndx, size, Unicode:=False), _
                               LibidTypeOrdinary
                End With
                IsControlReference = True

            Case REFERENCECONTROL
                LibIdSize = GetLong(Stream, ndx)
                StoreLibid Reference(UBound(Reference)).Control.LibId(1), _
                           GetString(Stream, ndx + 4, LibIdSize, Unicode:=False), _
                           LibidTypeTwiddled

            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The final element of the control reference is the "Extended" record,'
            ' another Libid, preceded by its size, and followed by two fields,    '
            ' a 4-byte "Reserved4" and a 2-byte "Reserved5", to be ignored, and   '
            ' then a GUID (this time a 'real' one, not a string) and a "cookie".  '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case REFERENCEEXTENDED
                LibIdSize = GetLong(Stream, ndx)
                With Reference(UBound(Reference)).Control
                    StoreLibid .LibId(2), _
                               GetString(Stream, ndx + 4, LibIdSize, Unicode:=False), _
                               LibidTypeExtended
                    .OriginalTypeLib = GetGUID(Stream, ndx + 4 + LibIdSize + 6)
                    .OriginalTypeLibString = GUIDtoString(.OriginalTypeLib)
                    .Cookie = GetLong(Stream, ndx + size - 4)
                End With
                IsControlReference = False

            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The only way to get here should be with a sub-record that does not  '
            ' belong to a reference. In practice, this will be the first Modules  '
            ' Record but, in theory, this routine has no knowledge of what comes  '
            ' next, so it simply flags the end of its work and makes the record   '
            ' ready for processing by any further logic there may be.             '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case Else
                ProjectRecordSeries = ProjectRecordSeries + 1
                ndx = ndx - 6
                size = 0

        End Select ' Record Type
        
        ndx = ndx + size
    
    Loop While ProjectRecordSeries = RecordSeries_PROJECTREFERENCES ' For next sub-record

End Sub

.. and one for the Module details:

Public Sub ExtractModules(ByRef Stream() As Byte, _
                          ByRef ndx As Long)
    
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    ' The PROJECTMODULES Record contains a series of sub-records for each module  '
    ' in the Project. The basic logic is the same as for the References (q.v.).   '
    ' There is an assortment of sub-records per module; they are in a particular  '
    ' sequence but the only real dependencies are that the count comes first and  '
    ' that the NAME sub-record is the first one for each individual module.       '
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    Dim RecordId                As Integer
    Dim size                    As Long
    
    Dim ndxModule               As Long

    Do
    
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        ' Extract the Record Id and Size, and increment the index into the chunk. '
        ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
        
        RecordId = GetInt(Stream, ndx)
        size = GetLong(Stream, ndx + 2)
        ndx = ndx + 6
        
        Select Case RecordId
        
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The first record contains a count of modules so the array can be    '
            ' set to the right dimensions immediately. The "Cookie" is irrelevant.'
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case MODULESCOUNT
                ReDim Module(1 To GetInt(Stream, ndx))
                    
            Case PROJECTCOOKIE

            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' NAME sub-records indicate a new module, so increment the index.     '
            ' The MBCS name, itself, is ignored in favour of the Unicode one.     '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case MODULENAME
                ndxModule = ndxModule + 1
            Case MODULENAMEUNICODE
                Module(ndxModule).Name = GetString(Stream, ndx, size, Unicode:=True)

            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The MBCS-format Stream name, and Description, like the MBCS-format  '
            ' Name above, are ignored in favour of the Unicode versions.          '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case MODULESTREAMNAME
            Case MODULESTREAMNAMEUNICODE
                Module(ndxModule).StreamName _
                        = GetString(Stream, ndx, size, Unicode:=True)

            Case MODULEDOCSTRING
            Case MODULEDOCSTRINGUNICODE
                Module(ndxModule).Description _
                        = GetString(Stream, ndx, size, Unicode:=True)

            Case MODULEOFFSET
                Module(ndxModule).Offset = GetLong(Stream, ndx)

            Case MODULEHELPCONTEXT
                Module(ndxModule).HelpContext = GetLong(Stream, ndx)

            Case MODULECOOKIE
                
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' Several Module sub-records contain no data, their presence, or      '
            ' absence, indicating attributes of the module.                       '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case MODULETYPEMODULERECORD
                Module(ndxModule).Type = ModuleTypeModule

            Case MODULETYPECLASSMODULERECORD
                Module(ndxModule).Type = ModuleTypeClassModule

            Case MODULEREADONLY
                Module(ndxModule).ReadOnly = True

            Case MODULEPRIVATE
                Module(ndxModule).Private = True

            Case MODULETERMINATOR
            
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            ' The only way to get here should be with a sub-record that does not  '
            ' belong to a module. In practice, this can only be the terminator    '
            ' Record but, in theory, this routine has no knowledge of what comes  '
            ' next, so it simply flags the end of its work and makes the record   '
            ' ready for processing by any further logic there may be.             '
            ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
            Case Else
                ProjectRecordSeries = ProjectRecordSeries + 1
                ndx = ndx - 6
                size = 0

        End Select ' Record Type

        ndx = ndx + size

    Loop While ProjectRecordSeries = RecordSeries_PROJECTMODULES ' For next sub-record

End Sub
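
Both of these routines rest on the same pattern: a two byte ID, a four byte size, and then that many bytes of data. When a project does not parse as expected it can help simply to list the record headers without trying to interpret them. What follows is only a sketch under that assumption. DumpRecordHeaders is a hypothetical name, not part of the code above; it relies on the GetInt and GetLong functions shown further down, and it will produce nonsense if it is started anywhere other than on a record boundary, or if it crosses a record that does not follow the simple layout.

```vba
Private Sub DumpRecordHeaders(ByRef Stream() As Byte, _
                              ByVal ndx As Long)

    ' Hypothetical diagnostic: lists the Id and size of each record from
    ' ndx onwards, assuming every record is Id (2) + size (4) + data.

    Dim RecordId As Integer
    Dim size     As Long

    ' A header needs six bytes; stop when fewer than that remain.
    Do While ndx + 5 <= UBound(Stream)

        RecordId = GetInt(Stream, ndx)
        size = GetLong(Stream, ndx + 2)

        Debug.Print "Offset " & ndx & _
                    ": Id 0x" & Right$("0000" & Hex$(RecordId), 4) & _
                    ", size " & size

        ' A negative size, or one that runs past the end of the data,
        ' means the walk has gone astray, so give up.
        If size < 0 Or size > UBound(Stream) - ndx - 5 Then Exit Do

        ndx = ndx + 6 + size

    Loop

End Sub
```

Called with the index at which ExtractReferences stopped, it should list the module records and finish on the 0x0010 terminator, whose four byte Reserved field reads as a size of zero.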

Now for the assorted functions called by the above routines. There are, if you like, two categories, those specific to the references, which are:

Private Sub StoreLibid(ByRef LibId As LibId, _
                       ByRef LibIdString As String, _
                       ByVal LibIdType As LibIdType)

    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    ' References are stored in one sort or another of Reference record, and they  '
    ' all use a "LibId", a string of sub-fields, separated by "#" characters, the '
    ' only slight exception being the dot (period) that separates the major and   '
    ' minor versions. This simple routine just stores the separate sub-fields in  '
    ' order to keep the calling code as free from clutter as possible.            '
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    
    Dim LibIdSplit

    LibIdSplit = Split(LibIdString, "#")
    
    With LibId
        
        .Type = LibIdType
        
        .Kind = AscW(Mid$(LibIdSplit(0), 3, 1))
        .GUIDString = Mid$(LibIdSplit(0), 4)
        .MajorVersion = Split(LibIdSplit(1), ".")(0)
        .MinorVersion = Split(LibIdSplit(1), ".")(1)
        .Lcid = LibIdSplit(2)
        .Path = LibIdSplit(3)
        .RegName = LibIdSplit(4)
    
    End With

End Sub
 
Private Sub StoreProjectReference(ByRef Project As ProjectReference, _
                                  ByRef Stream() As Byte, _
                                  ByVal Streamndx As Long)

    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    ' This is not ideal but it moves a chunk of code out of the mainline. Project '
    ' references consist of an 'absolute' path, a 'relative' path, and major and  '
    ' minor versions (the real meaning of which I do not know).                   '
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

    Dim size As Long

    size = GetLong(Stream, Streamndx)
    With Project.ProjectLibId(0)
        .Type = LibidTypeProjectAbsolute
        .Kind = Stream(Streamndx + 6)                   ' Past size and "*\"
        .Path = GetString(Stream, Streamndx + 7, size - 3, Unicode:=False)
    End With
    Streamndx = Streamndx + 4 + size                    ' Skip over absolute

    size = GetLong(Stream, Streamndx)
    With Project.ProjectLibId(1)
        .Type = LibidTypeProjectRelative
        .Kind = Stream(Streamndx + 6)                   ' Past size and "*\"
        .Path = GetString(Stream, Streamndx + 7, size - 3, Unicode:=False)
    End With
    Streamndx = Streamndx + size + 4                    ' Skip over relative

    ' The 2-byte minor version follows the 4-byte major version.
    Project.MajorVersion = GetLong(Stream, Streamndx)
    Project.MinorVersion = GetInt(Stream, Streamndx + 4)

End Sub

Private Function GetGUID(ByRef Stream() As Byte, _
                         ByVal Streamndx As Long) _
                 As GUID

    Dim ndx As Long

    With GetGUID
        .Data1 = GetLong(Stream, Streamndx)
        .Data2 = GetInt(Stream, Streamndx + 4)
        .Data3 = GetInt(Stream, Streamndx + 6)
        For ndx = 0 To 7
            .Data4(ndx) = Stream(Streamndx + 8 + ndx)
        Next ndx
    End With

End Function

Private Function GUIDtoString(ByRef GUID As GUID) As String

    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
    ' A simple conversion of a standard format GUID to its string representation. '
    ' This is not strictly necessary; it just makes life easier for human readers.'
    ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '

    Dim ndx         As Long
    Dim GUIDString  As String

    With GUID
        GUIDtoString = "{" & Right("00000000" & Hex(.Data1), 8) & "-"
        GUIDtoString = GUIDtoString & Right("0000" & Hex(.Data2), 4) & "-"
        GUIDtoString = GUIDtoString & Right("0000" & Hex(.Data3), 4) & "-"
        For ndx = 0 To 7
            GUIDtoString = GUIDtoString & Right("00" & Hex(.Data4(ndx)), 2)
            If ndx = 1 Then GUIDtoString = GUIDtoString & "-"
        Next ndx
    End With

    GUIDtoString = GUIDtoString & "}"

End Function
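
To make the LibId layout a little more concrete, here is a minimal sketch that splits a sample string in the same way as StoreLibid. The sample is the form the OLE Automation reference usually takes, but treat it purely as an illustration: the path on your machine may well differ, and ShowLibidFields is a hypothetical name that is not used anywhere else in this code.

```vba
Private Sub ShowLibidFields()

    ' Illustrative sample only; the path, in particular, is an assumption.
    Const SampleLibid As String = _
        "*\G{00020430-0000-0000-C000-000000000046}#2.0#0#" & _
        "C:\Windows\System32\stdole2.tlb#OLE Automation"

    Dim Fields()  As String
    Dim Version() As String

    Fields = Split(SampleLibid, "#")        ' Five "#"-separated sub-fields
    Version = Split(Fields(1), ".")         ' Major and minor, split on the dot

    ' The first sub-field is "*\", a one-character kind, and the GUID.
    ' StoreLibid keeps the kind as a character code; here it is shown as is.
    Debug.Print "Kind:    " & Mid$(Fields(0), 3, 1)     ' G
    Debug.Print "GUID:    " & Mid$(Fields(0), 4)
    Debug.Print "Version: " & Version(0) & "." & Version(1)
    Debug.Print "Lcid:    " & Fields(2)
    Debug.Print "Path:    " & Fields(3)
    Debug.Print "RegName: " & Fields(4)                 ' OLE Automation

End Sub
```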

.. and those more general routines, which serve only in place of the API calls I prefer not to use:

' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
' As I have said elsewhere, I don't like using API calls for demonstrations; there are  '
' just too many pitfalls for the unwary. Although I do use them in my own code, here I  '
' have equivalent routines, using native VBA, to move 2, and 4, bytes to, respectively, '
' an Integer and a Long, and to move an arbitrary number of bytes to a String.          '
' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' '
Private Function GetInt(ByRef Stream() As Byte, _
                        ByVal ndx As Long) _
                 As Integer
    
    GetInt = Stream(ndx) + _
             (Stream(ndx + 1) Mod 128) * 256& _
             + (Stream(ndx + 1) >= 128) * 32768

End Function
 
Private Function GetLong(ByRef Stream() As Byte, _
                         ByVal ndx As Long) _
                 As Long
    
    GetLong = Stream(ndx) + _
              Stream(ndx + 1) * 256& + _
              Stream(ndx + 2) * 65536 + _
              (Stream(ndx + 3) Mod 128) * 16777216
    
    If (Stream(ndx + 3) >= 128) Then
        GetLong = GetLong - &H40000000 - &H40000000
    End If

End Function

Private Function GetString(ByRef Stream() As Byte, _
                           ByVal ndx As Long, _
                           ByVal size As Long, _
                           ByVal Unicode As Boolean) _
                 As String
    
    Dim TempString As String
    
    TempString = ""
    
    For ndx = ndx To ndx + size - 1
        TempString = TempString & Chr(Stream(ndx))
    Next ndx
    
    If Unicode Then
        TempString = StrConv(TempString, vbFromUnicode)
    End If
    
    GetString = TempString

End Function
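
The sign handling in GetInt and GetLong is the fiddly part: the top bit of the last byte has to be turned into a negative value without overflowing on the way. A quick way to convince yourself that it works is to feed the functions a few byte patterns whose values are known. This is only a sketch, and CheckByteReaders is a hypothetical name; it must sit in the same module as the functions it calls because they are Private.

```vba
Private Sub CheckByteReaders()

    Dim Bytes(0 To 5) As Byte

    ' 0x0010, stored low byte first, is 16.
    Bytes(0) = &H10: Bytes(1) = 0
    Debug.Assert GetInt(Bytes, 0) = 16

    ' 0xFFFF is -1 as a signed two byte Integer.
    Bytes(0) = &HFF: Bytes(1) = &HFF
    Debug.Assert GetInt(Bytes, 0) = -1

    ' 78 56 34 12, stored low byte first, is 0x12345678.
    Bytes(2) = &H78: Bytes(3) = &H56: Bytes(4) = &H34: Bytes(5) = &H12
    Debug.Assert GetLong(Bytes, 2) = 305419896

    ' 00 00 00 80 has only the sign bit set: the most negative Long.
    Bytes(2) = 0: Bytes(3) = 0: Bytes(4) = 0: Bytes(5) = &H80
    Debug.Assert GetLong(Bytes, 2) = -2147483648#

    Debug.Print "Byte readers behave as expected"

End Sub
```

If any of the assertions fails, execution stops on that line in the editor; otherwise the only sign of life is the message in the Immediate window.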

As before, you won’t actually see anything happen when you run the extraction code. All that has happened is that the data has been reformatted from one arcane structure to another, possibly only slightly less arcane. What you do with the extracted data is up to you, but a simple routine to show some of it may help. This routine is not very good but it shows, I hope, how the contents of the stream are a little more available than they were. Add it somewhere to your module:

Private Sub ShowDetails()
    
    Dim ndx         As Long
    Dim Msg         As String
    
    Msg = "Project " & Project.Name
    Msg = Msg & " with description: """ & Project.Description & """" & vbCr
    Msg = Msg & " .. has " & UBound(Reference) & " references, to: " & vbCr
    
    For ndx = LBound(Reference) To UBound(Reference)
        With Reference(ndx)

            Msg = Msg & " ..  .. " & .Name
            
            Select Case .Type
                Case REFERENCEREGISTERED
                    Msg = Msg & "(" & .Registered.LibId.Path & ")" & vbCr
                Case REFERENCEPROJECT
                    Msg = Msg & "(" & .Project.ProjectLibId(0).Path & ")" & vbCr
                Case REFERENCECONTROL
                    Msg = Msg & "(" & .Control.LibId(0).Path & ")" & vbCr
            End Select
        
        End With
    Next

    Msg = Msg & " .. and contains " & UBound(Module) & " modules:" & vbCr
    
    For ndx = 1 To UBound(Module)
        With Module(ndx)
            Msg = Msg & " ..  .. " & .Name & " starting at offset " & _
                        .Offset & " in stream " & .StreamName & vbCr
        End With
    Next

    Debug.Print Msg

End Sub

.. and, finally, add a Call to it from the main driver, after the Call to the ExtractdirData routine you've just seen. I hope you have been able to follow all that, and now have your own code to read the “dir” stream. If not, I have saved my work and it is, or soon will be, available here [link to the file on this site at files/dirStream.zip].