This is mostly extracted from http://reviews.llvm.org/D18960.
The general idea for TLSDESC is that the two GD GOT entries are used
for a function pointer and its argument. The dynamic linker sets
both. In the non-dlopen case the dynamic linker sets the function to
the identity and the argument to the offset in the TLS block.
All that the static linker has to do in the non-dlopen case is
relocate the code to point to the GOT entries and create a dynamic
relocation.
The dlopen case is more complicated, but can be implemented in another patch.
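A minimal sketch of the idea, with hypothetical names (this is not LLD's
actual representation):

  #include <cstddef>
  #include <cstdint>

  // The two GD GOT entries form a descriptor: a resolver function and its
  // argument. The dynamic linker fills in both.
  struct TlsDescriptor {
    std::ptrdiff_t (*Resolver)(TlsDescriptor *); // first GOT entry
    std::size_t Arg;                             // second GOT entry
  };

  // Non-dlopen case: the resolver is the identity function and Arg is the
  // offset of the variable in the TLS block.
  static std::ptrdiff_t identityResolver(TlsDescriptor *Desc) {
    return static_cast<std::ptrdiff_t>(Desc->Arg);
  }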
llvm-svn: 271569
This patch implements the next relaxation from the latest ABI:
"Convert memory operand of test and binop into immediate operand, where binop is one of adc, add, and, cmp, or,
sbb, sub, xor instructions, when position-independent code is disabled."
It is described in the System V Application Binary Interface AMD64 Architecture Processor
Supplement Draft Version 0.99.8 (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf,
section B.2 "Optimize GOTPCRELX Relocations").
Differential revision: http://reviews.llvm.org/D20793
llvm-svn: 271405
MergedInputSection::getOffset is the busiest function in LLD if string
merging is enabled and input files have lots of mergeable sections.
It is usually the case when creating an executable with debug info,
so it is pretty common.
The reason why it is slow is that it has to do fairly complex
computations. For non-mergeable sections, section contents are
contiguous in output, so in order to compute an output offset,
we only have to add the output section's base address to an input
offset. But for mergeable strings, section contents are split for
merging, so they are not contiguous. We've got to do some lookups.
We used to do binary search on the list of section pieces.
I think it is slow because binary search is hostile to branch prediction.
This patch replaces it with a hash table lookup, which seems to work
pretty well. Below is "perf stat -r10" output when linking clang
with debug info. In this case the patch speeds up the link by about 4%.
Before:
6584.153205 task-clock (msec) # 1.001 CPUs utilized ( +- 0.09% )
238 context-switches # 0.036 K/sec ( +- 6.59% )
0 cpu-migrations # 0.000 K/sec ( +- 50.92% )
1,067,675 page-faults # 0.162 M/sec ( +- 0.15% )
18,369,931,470 cycles # 2.790 GHz ( +- 0.09% )
9,640,680,143 stalled-cycles-frontend # 52.48% frontend cycles idle ( +- 0.18% )
<not supported> stalled-cycles-backend
21,206,747,787 instructions # 1.15 insns per cycle
# 0.45 stalled cycles per insn ( +- 0.04% )
3,817,398,032 branches # 579.786 M/sec ( +- 0.04% )
132,787,249 branch-misses # 3.48% of all branches ( +- 0.02% )
6.579106511 seconds time elapsed ( +- 0.09% )
After:
6312.317533 task-clock (msec) # 1.001 CPUs utilized ( +- 0.19% )
221 context-switches # 0.035 K/sec ( +- 4.11% )
1 cpu-migrations # 0.000 K/sec ( +- 45.21% )
1,280,775 page-faults # 0.203 M/sec ( +- 0.37% )
17,611,539,150 cycles # 2.790 GHz ( +- 0.19% )
10,285,148,569 stalled-cycles-frontend # 58.40% frontend cycles idle ( +- 0.30% )
<not supported> stalled-cycles-backend
18,794,779,900 instructions # 1.07 insns per cycle
# 0.55 stalled cycles per insn ( +- 0.03% )
3,287,450,865 branches # 520.799 M/sec ( +- 0.03% )
72,259,605 branch-misses # 2.20% of all branches ( +- 0.01% )
6.307411828 seconds time elapsed ( +- 0.19% )
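For illustration, a minimal sketch of the lookup idea, with hypothetical
names (the real code also has to handle relocations that point into the
middle of a piece):

  #include <cstdint>
  #include <unordered_map>
  #include <vector>

  struct Piece {
    uint64_t InputOff;  // where the piece starts in the input section
    uint64_t OutputOff; // where the piece landed in the output section
  };

  struct MergedSection {
    std::vector<Piece> Pieces;
    // Maps a piece's input offset to its output offset so getOffset becomes
    // a hash lookup instead of a binary search over Pieces.
    std::unordered_map<uint64_t, uint64_t> OffsetMap;

    void buildOffsetMap() {
      for (const Piece &P : Pieces)
        OffsetMap[P.InputOff] = P.OutputOff;
    }

    uint64_t getOffset(uint64_t InputOff) const {
      return OffsetMap.at(InputOff);
    }
  };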
Differential Revision: http://reviews.llvm.org/D20645
llvm-svn: 270999
MIPS .reginfo and .MIPS.options sections are consumed by the linker, and
the linker produces a single output section. But it is possible that
input files contain section symbols that point to the corresponding input
sections. When generating relocatable output we need to write
such symbols to the output file.
Fixes bug 27878.
Differential Revision: http://reviews.llvm.org/D20688
llvm-svn: 270910
System V Application Binary Interface AMD64 Architecture Processor Supplement Draft Version 0.99.8
(https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-r249.pdf, section B.2 "Optimize GOTPCRELX Relocations")
introduces possible relaxations for R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX.
This patch implements the following relaxation:
mov foo@GOTPCREL(%rip), %reg => lea foo(%rip), %reg
and also opens the door to implementing all the other ones.
The implementation was suggested by Rafael Ávila de Espíndola, with a few additions and testcases by me.
Differential revision: http://reviews.llvm.org/D15779
llvm-svn: 270705
Previously, a mergeable section's constructor did more than just
set member variables; it split the section contents into small
pieces. That is not always a computationally cheap task because, if
the section is a mergeable string section, it needs to scan the
entire section to split it by NUL characters.
If a section would be thrown away by GC, that cost ended up
being a waste of time. It is an even larger problem if the
section is compressed -- all the time spent uncompressing and
splitting it is wasted.
Luckily, we can defer section splitting until after GC. We just have
to remember which offsets are in use during GC and apply that later.
This patch implements it.
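A minimal sketch of the splitting step that is now deferred (hypothetical
names; the real code also records which pieces GC found to be in use):

  #include <cstddef>
  #include <cstdint>
  #include <string_view>
  #include <vector>

  struct Piece {
    uint64_t InputOff;
    bool Live = false; // filled in from the offsets remembered during GC
  };

  // Split a SHF_STRINGS section into NUL-terminated pieces. Running this
  // only for live sections avoids scanning (or decompressing) dead ones.
  static std::vector<Piece> splitStrings(std::string_view Data) {
    std::vector<Piece> Pieces;
    uint64_t Off = 0;
    while (!Data.empty()) {
      std::size_t End = Data.find('\0');
      std::size_t Size =
          End == std::string_view::npos ? Data.size() : End + 1;
      Pieces.push_back({Off, false});
      Data.remove_prefix(Size);
      Off += Size;
    }
    return Pieces;
  }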
Differential Revision: http://reviews.llvm.org/D20516
llvm-svn: 270455
This patch adds a Size member to SectionPiece so that getRangeAndSize
can just return a SectionPiece instead of a std::pair<SectionPiece *, uint_t>.
Also renamed the function.
llvm-svn: 270346
We were using std::pair to represent pieces of splittable section
contents. It hurt readability because "first" and "second" are not
meaningful. This patch gives them names.
One more thing: piecewise liveness information was stored in
the second element of the pair as a special value of the output
section offset. That was confusing, so I defined a new bit, "Live",
in the new struct.
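A minimal sketch of the shape of the new struct (illustrative field types;
not necessarily LLD's exact definition):

  #include <cstdint>

  struct SectionPiece {
    uint64_t InputOff;  // offset of the piece in the input section
    uint64_t OutputOff; // offset of the piece in the output section
    bool Live;          // explicit liveness bit instead of a magic offset
  };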
llvm-svn: 270340
This makes it explicit that each R_RELAX_TLS_* is equivalent to some
other expression.
With this I think we are at a sweet spot for how much is done in
Target.cpp. I did experiment with moving *all* the value math out of it.
It has the advantage that we know the final value in target independent
code, but it gets quite verbose.
llvm-svn: 270277
This adds direct support for computing offsets from the thread pointer
for both variants. Of the architectures we support, variant 1 is used
only by aarch64 (but that doesn't seem to be documented anywhere).
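A rough sketch of the two layouts (constants and alignment handling are
illustrative; the real rules are target-specific):

  #include <cstdint>

  static uint64_t alignTo(uint64_t V, uint64_t Align) {
    return (V + Align - 1) & ~(Align - 1);
  }

  // Variant 1 (e.g. aarch64): the TLS block lives above the thread pointer,
  // after a fixed-size TCB, so offsets from TP are positive.
  static uint64_t tpOffsetVariant1(uint64_t SymOff, uint64_t TcbSize) {
    return TcbSize + SymOff; // e.g. TcbSize == 16 on aarch64
  }

  // Variant 2 (e.g. x86_64): the TLS block ends at the thread pointer, so
  // offsets from TP are negative.
  static int64_t tpOffsetVariant2(uint64_t SymOff, uint64_t TlsSize,
                                  uint64_t Align) {
    return int64_t(SymOff) - int64_t(alignTo(TlsSize, Align));
  }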
llvm-svn: 270243
The new names reflect the purpose of the corresponding GOT entries better.
Both expression types relate to entries allocated in the 'local'
part of the MIPS GOT. R_MIPS_GOT_LOCAL_PAGE is for entries that contain
'page' addresses. R_MIPS_GOT_LOCAL is for entries that contain 'full' addresses.
llvm-svn: 269597
We were previously using an output offset of -1 for both GC'd and tail
merged pieces. We need to distinguish these two cases in order to filter
GC'd symbols from the symbol table -- we were previously asserting when we
asked for the VA of a symbol pointing into a dead piece, which would end
up asking the tail merging string table for an offset even though we hadn't
initialized it properly.
This patch fixes the bug by using an offset of -1 to exclusively mean GC'd
pieces, using 0 for tail merges, and distinguishing the tail merge case from
an offset of 0 by asking the output section whether it is tail merge.
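A minimal sketch of the resulting logic, with hypothetical names:

  #include <cassert>
  #include <cstdint>
  #include <unordered_map>

  struct Piece {
    uint64_t OutputOff = 0; // -1 means GC'd; 0 may mean "tail merged"
  };

  struct MergeOutputSection {
    bool TailMerge = false;
    std::unordered_map<uint64_t, uint64_t> Offsets; // InputOff -> final off
  };

  static bool isGced(const Piece &P) { return P.OutputOff == uint64_t(-1); }

  static uint64_t getPieceOffset(const Piece &P, const MergeOutputSection &S,
                                 uint64_t InputOff) {
    assert(!isGced(P) && "asked for the offset of a GC'd piece");
    if (S.TailMerge)                 // 0 is ambiguous, so ask the section
      return S.Offsets.at(InputOff); // resolved by the tail-merge builder
    return P.OutputOff;
  }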
Differential Revision: http://reviews.llvm.org/D19953
llvm-svn: 268604
The MIPS N64 ABI introduces the .MIPS.options section, which specifies
miscellaneous options to be applied to an object/shared/executable file.
LLVM, as well as modern versions of the GNU tools, reads and writes only
one type of option - ODK_REGINFO. It is an exact copy of the .reginfo
section used by the O32 ABI.
llvm-svn: 268485
Relocations against sections with no SHF_ALLOC bit are R_ABS relocations.
Currently we are creating a Relocations vector for them, but that is wasteful.
This patch skips the vector construction and directly applies relocations
in place.
This patch seems to be pretty effective for large executables with debug info.
r266158 (Rafael's patch to change the way we apply relocations) caused a
temporary performance degradation for such executables, but this patch makes
it even faster than before.
Time to link clang with debug info (output size is 1070 MB):
before r266158: 15.312 seconds (0%)
r266158: 17.301 seconds (+13.0%)
Head: 16.484 seconds (+7.7%)
w/patch: 13.166 seconds (-14.0%)
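A minimal sketch of the in-place fast path (hypothetical names; the real
code dispatches on the relocation type and width):

  #include <cstdint>
  #include <vector>

  struct Reloc {
    uint64_t Offset; // where in the section to patch
    uint64_t Value;  // already-resolved absolute value (R_ABS)
  };

  // Patch the section bytes in a single pass instead of first recording the
  // relocations in a vector and revisiting them later.
  static void applyInPlace(std::vector<uint8_t> &Data,
                           const std::vector<Reloc> &Rels) {
    for (const Reloc &R : Rels)
      for (int I = 0; I < 8; ++I) // illustrative 64-bit little-endian write
        Data[R.Offset + I] = uint8_t(R.Value >> (8 * I));
  }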
Differential Revision: http://reviews.llvm.org/D19645
llvm-svn: 267917
The fix is to handle local symbols referring to SHF_MERGE sections.
Original message:
GC entries of SHF_MERGE sections.
It is a fairly direct extension of the gc algorithm. For merge sections
instead of remembering just a live bit, we remember which offsets
were used.
This reduces the .rodata sections in chromium from 9648861 to 9477472
bytes.
llvm-svn: 267233
It is a fairly direct extension of the gc algorithm. For merge sections
instead of remembering just a live bit, we remember which offsets were
used.
This reduces the .rodata sections in chromium from 9648861 to 9477472
bytes.
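A minimal sketch of the piece-marking step, with hypothetical names:

  #include <algorithm>
  #include <cstdint>
  #include <iterator>
  #include <vector>

  struct Piece {
    uint64_t InputOff;
    bool Live = false;
  };

  // For a SHF_MERGE section, GC records the referenced offsets rather than
  // a single live bit; afterwards only the pieces containing those offsets
  // are marked live. Pieces are sorted by InputOff.
  static void markLivePieces(std::vector<Piece> &Pieces,
                             const std::vector<uint64_t> &LiveOffsets) {
    for (uint64_t Off : LiveOffsets) {
      // The containing piece is the last one whose start is <= Off.
      auto It = std::upper_bound(
          Pieces.begin(), Pieces.end(), Off,
          [](uint64_t O, const Piece &P) { return O < P.InputOff; });
      if (It != Pieces.begin())
        std::prev(It)->Live = true;
    }
  }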
llvm-svn: 267164
It turns out that this will read data from the section to properly
handle Elf_Rel implicit addends.
Sorry for the noise.
Original messages:
Try to fix Windows lld build.
Move getRelocTarget to ObjectFile.
It doesn't use anything from the InputSection.
llvm-svn: 267163
This requires adding a few more expression types, but is already a small
simplification. Having Writer.cpp know the exact expression will also
allow further simplifications.
llvm-svn: 266604
With this patch we use the first scan over the relocations to remember
the information we found about them: whether they will be relaxed, whether
a PLT will be used, etc.
With that the actual relocation application becomes much simpler. That
is particularly true for the interfaces in Target.h.
This unfortunately means that we now do two passes over relocations for
non-SHF_ALLOC sections. I think this can be solved by factoring out the
code that scans a single relocation. It can then be used both as a scan
that records info and for a dedicated direct relocation of non-SHF_ALLOC
sections.
I also think it is possible to reduce the number of enum values by
representing a target with just an OutputSection and an offset (which
can be from the start or end).
This should unblock adding features like relocation optimizations.
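A minimal sketch of what the scan records per relocation (names and fields
are illustrative, not LLD's actual ones):

  #include <cstdint>

  enum class RelExpr { Abs, Got, Plt, RelaxedTlsLe /* ... */ };

  struct Relocation {
    RelExpr Expr;    // how to compute the value, decided during the scan
    uint32_t Type;   // original ELF relocation type
    uint64_t Offset; // where in the section to apply it
    int64_t Addend;
    const void *Sym; // the target symbol (opaque here)
  };

  // The scan pass classifies each relocation once and appends a Relocation;
  // the apply pass just walks the vector, computes each value from Expr and
  // writes it, without re-deciding PLT/GOT/relaxation questions.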
llvm-svn: 266158