Bitcode: Collect all MDString records into a single blob

Optimize output of MDStrings in bitcode. This emits them in big blocks (currently 1024) in a pair of records:
  - BULK_STRING_SIZES: the sizes of the strings in the block, and
  - BULK_STRING_DATA: a single blob, which is the concatenation of all the strings.

Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this should (a) slightly reduce bitcode size, since there is less record overhead, and (b) greatly improve reading speed, since blobs are super cheap to deserialize.

I needed to add support for blobs to streaming input to get the test suite passing.
  - StreamingMemoryObject::getPointer reads ahead and returns the address of the blob.
  - To avoid a possible reallocation of StreamingMemoryObject::Bytes, BitstreamCursor::readRecord needs to move the call to JumpToEnd forward so that getPointer is the last bitstream operation.

llvm-svn: 264409
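As a rough illustration of the sizes-plus-blob layout described in the message, here is a minimal standalone sketch. It is not the code from this patch: BulkStrings, packStrings, and unpackStrings are hypothetical names, and the real writer emits the two pieces as bitcode records rather than as in-memory containers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical container for one bulk block; not an LLVM type.
struct BulkStrings {
  std::vector<uint32_t> Sizes; // one length per string, in order
  std::string Blob;            // all string bytes, concatenated
};

// Collect the strings into a size list plus a single blob.
BulkStrings packStrings(const std::vector<std::string> &Strings) {
  BulkStrings Out;
  Out.Sizes.reserve(Strings.size());
  for (const std::string &S : Strings) {
    Out.Sizes.push_back(static_cast<uint32_t>(S.size()));
    Out.Blob += S;
  }
  return Out;
}

// Slice the blob back into strings without copying any bytes.
// The returned views point into In.Blob, so In must outlive them.
std::vector<std::string_view> unpackStrings(const BulkStrings &In) {
  std::vector<std::string_view> Views;
  Views.reserve(In.Sizes.size());
  std::string_view Rest(In.Blob);
  for (uint32_t Size : In.Sizes) {
    assert(Size <= Rest.size() && "sizes exceed blob length");
    Views.push_back(Rest.substr(0, Size));
    Rest.remove_prefix(Size);
  }
  return Views;
}
```

Reading back is a single pass over the sizes with no per-string record to parse, which is the property the message relies on for the expected reading-speed gain.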
2016-03-25 14:40:18 +00:00
parent 59bcbba6b4
commit fdbf0a5af8
11 changed files with 142 additions and 50 deletions
--- a/llvm/lib/Bitcode/Reader/BitstreamReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitstreamReader.cpp
@@ -261,6 +261,10 @@ unsigned BitstreamCursor::readRecord(unsigned AbbrevID,
    }

    // Otherwise, inform the streamer that we need these bytes in memory.
+    // Skip over tail padding first.  We can't do it later if this is a
+    // streaming memory object, since that could reallocate the storage that
+    // the blob pointer references.
+    JumpToBit(NewEnd);
    const char *Ptr = (const char*)
      BitStream->getBitcodeBytes().getPointer(CurBitPos/8, NumElts);

@@ -272,8 +276,6 @@ unsigned BitstreamCursor::readRecord(unsigned AbbrevID,
      for (; NumElts; --NumElts)
        Vals.push_back((unsigned char)*Ptr++);
    }
-    // Skip over tail padding.
-    JumpToBit(NewEnd);
  }

  return Code;