------------------------------------------------------------------------
r325713 | sdardis | 2018-02-21 20:01:43 +0000 (Wed, 21 Feb 2018) | 5 lines
[mips][lld] Address post commit review nit.
Address @ruiu's post commit review comment about a value which is intended
to be a unsigned 32 bit integer as using uint32_t rather than unsigned.
------------------------------------------------------------------------
------------------------------------------------------------------------
r325647 | sdardis | 2018-02-20 23:49:17 +0000 (Tue, 20 Feb 2018) | 27 lines
[mips][lld] Spectre variant two mitigation for MIPSR2
This patch provides migitation for CVE-2017-5715, Spectre variant two,
which affects the P5600 and P6600. It implements the LLD part of
-z hazardplt. Like the Clang part of this patch, I have opted for that
specific option name in case alternative migitation methods are required
in the future.
The mitigation strategy suggested by MIPS for these processors is to use
hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard
barrier variants of the 'jalr' and 'jr' instructions respectively.
These instructions impede the execution of instruction stream until
architecturally defined hazards (changes to the instruction stream,
privileged registers which may affect execution) are cleared. These
instructions in MIPS' designs are not speculated past.
These instructions are defined by the MIPS32R2 ISA, so this mitigation
method is not compatible with processors which implement an earlier
revision of the MIPS ISA.
For LLD, this changes PLT stubs to use 'jalr.hb' and 'jr.hb'.
Reviewers: atanasyan, ruiu
Differential Revision: https://reviews.llvm.org/D43488
------------------------------------------------------------------------
llvm-svn: 327757
------------------------------------------------------------------------
r325651 | sdardis | 2018-02-21 00:05:05 +0000 (Wed, 21 Feb 2018) | 34 lines
[mips] Spectre variant two mitigation for MIPSR2
This patch provides mitigation for CVE-2017-5715, Spectre variant two,
which affects the P5600 and P6600. It provides the option
-mindirect-jump=hazard, which instructs the LLVM backend to replace
indirect branches with their hazard barrier variants.
This option is accepted when targeting MIPS revision two or later.
The migitation strategy suggested by MIPS for these processors is to
use two hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard
barrier variants of the 'jalr' and 'jr' instructions respectively.
These instructions impede the execution of instruction stream until
architecturally defined hazards (changes to the instruction stream,
privileged registers which may affect execution) are cleared. These
instructions in MIPS' designs are not speculated past.
These instructions are used with the option -mindirect-jump=hazard
when branching indirectly and for indirect function calls.
These instructions are defined by the MIPS32R2 ISA, so this mitigation
method is not compatible with processors which implement an earlier
revision of the MIPS ISA.
Implementation note: I've opted to provide this as an
-mindirect-jump={hazard,...} style option in case alternative
mitigation methods are required for other implementations of the MIPS
ISA in future, e.g. retpoline style solutions.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D43487
------------------------------------------------------------------------
llvm-svn: 327755
------------------------------------------------------------------------
r325653 | sdardis | 2018-02-21 00:06:53 +0000 (Wed, 21 Feb 2018) | 31 lines
[mips] Spectre variant two mitigation for MIPSR2
This patch provides mitigation for CVE-2017-5715, Spectre variant two,
which affects the P5600 and P6600. It implements the LLVM part of
-mindirect-jump=hazard. It is _not_ enabled by default for the P5600.
The migitation strategy suggested by MIPS for these processors is to use
hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard
barrier variants of the 'jalr' and 'jr' instructions respectively.
These instructions impede the execution of instruction stream until
architecturally defined hazards (changes to the instruction stream,
privileged registers which may affect execution) are cleared. These
instructions in MIPS' designs are not speculated past.
These instructions are used with the attribute +use-indirect-jump-hazard
when branching indirectly and for indirect function calls.
These instructions are defined by the MIPS32R2 ISA, so this mitigation
method is not compatible with processors which implement an earlier
revision of the MIPS ISA.
Performance benchmarking of this option with -fpic and lld using
-z hazardplt shows a difference of overall 10%~ time increase
for the LLVM testsuite. Certain benchmarks such as methcall show a
substantially larger increase in time due to their nature.
Reviewers: atanasyan, zoran.jovanovic
Differential Revision: https://reviews.llvm.org/D43486
------------------------------------------------------------------------
llvm-svn: 327751
------------------------------------------------------------------------
r325049 | rnk | 2018-02-13 12:47:49 -0800 (Tue, 13 Feb 2018) | 17 lines
[X86] Use EDI for retpoline when no scratch regs are left
Summary:
Instead of solving the hard problem of how to pass the callee to the indirect
jump thunk without a register, just use a CSR. At a call boundary, there's
nothing stopping us from using a CSR to hold the callee as long as we save and
restore it in the prologue.
Also, add tests for this mregparm=3 case. I wrote execution tests for
__llvm_retpoline_push, but they never got committed as lit tests, either
because I never rewrote them or because they got lost in merge conflicts.
Reviewers: chandlerc, dwmw2
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D43214
------------------------------------------------------------------------
llvm-svn: 325090
------------------------------------------------------------------------
r324645 | dwmw2 | 2018-02-08 12:06:05 -0800 (Thu, 08 Feb 2018) | 5 lines
[X86] Support 'V' register operand modifier
This allows the register name to be printed without the leading '%'.
This can be used for emitting calls to the retpoline thunks from inline
asm.
------------------------------------------------------------------------
llvm-svn: 325089
------------------------------------------------------------------------
r324449 | chandlerc | 2018-02-06 22:16:24 -0800 (Tue, 06 Feb 2018) | 15 lines
[x86/retpoline] Make the external thunk names exactly match the names
that happened to end up in GCC.
This is really unfortunate, as the names don't have much rhyme or reason
to them. Originally in the discussions it seemed fine to rely on aliases
to map different names to whatever external thunk code developers wished
to use but there are practical problems with that in the kernel it turns
out. And since we're discovering this practical problems late and since
GCC has already shipped a release with one set of names, we are forced,
yet again, to blindly match what is there.
Somewhat rushing this patch out for the Linux kernel folks to test and
so we can get it patched into our releases.
Differential Revision: https://reviews.llvm.org/D42998
------------------------------------------------------------------------
llvm-svn: 325088
Original commit message:
------------------------------------------------------------------------
r323155 | chandlerc | 2018-01-22 14:05:25 -0800 (Mon, 22 Jan 2018) | 133 lines
Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.
There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.
When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.
We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
------------------------------------------------------------------------
llvm-svn: 324025
------------------------------------------------------------------------
r323155 | chandlerc | 2018-01-22 14:05:25 -0800 (Mon, 22 Jan 2018) | 133 lines
Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.
There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.
When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.
We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
------------------------------------------------------------------------
llvm-svn: 324012
------------------------------------------------------------------------
r323915 | chandlerc | 2018-01-31 12:56:37 -0800 (Wed, 31 Jan 2018) | 17 lines
[x86] Make the retpoline thunk insertion a machine function pass.
Summary:
This removes the need for a machine module pass using some deeply
questionable hacks. This should address PR36123 which is a case where in
full LTO the memory usage of a machine module pass actually ended up
being significant.
We should revert this on trunk as soon as we understand and fix the
memory usage issue, but we should include this in any backports of
retpolines themselves.
Reviewers: echristo, MatzeB
Subscribers: sanjoy, mcrosier, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D42726
------------------------------------------------------------------------
llvm-svn: 324009
------------------------------------------------------------------------
r323155 | chandlerc | 2018-01-22 14:05:25 -0800 (Mon, 22 Jan 2018) | 133 lines
Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.
There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.
When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.
We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.
Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D41723
------------------------------------------------------------------------
llvm-svn: 324007
------------------------------------------------------------------------
r312509 | dannyb | 2017-09-04 19:17:42 -0700 (Mon, 04 Sep 2017) | 1 line
NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification
------------------------------------------------------------------------
llvm-svn: 319952
------------------------------------------------------------------------
r314733 | rsmith | 2017-10-02 15:43:36 -0700 (Mon, 02 Oct 2017) | 5 lines
PR33839: Fix -Wunused handling for structured binding declarations.
We warn about a structured binding declaration being unused only if none of its
bindings are used.
------------------------------------------------------------------------
llvm-svn: 319847
------------------------------------------------------------------------
r311734 | ruiu | 2017-08-24 16:51:40 -0700 (Thu, 24 Aug 2017) | 44 lines
[MACH-O] Fix the ASM code generated for __stub_helpers section
Patch by Patricio Villalobos.
I discovered that lld for darwin is generating the wrong code for lazy
bindings in the __stub_helper section (at least for osx 10.12). This is
the way i can reproduce this problem, using this program:
#include <stdio.h>
int main(int argc, char **argv) {
printf("C: printf!\n");
puts("C: puts!\n");
return 0;
}
Then I link it using i have tested it in 3.9, 4.0 and 4.1 versions:
$ clang -c hello.c
$ lld -flavor darwin hello.o -o h1 -lc
When i execute the binary h1 the system gives me the following error:
C: printf!
dyld: lazy symbol binding failed:
BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB
has segment 4 which is too large (0..3)
dyld: BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB has segment 4 which is too
large (0..3)
Trace/BPT trap: 5
Investigating the code, it seems that the problem is that the asm code
generated in the file StubPass.cpp, specifically in the line 323,when it
adds, what it seems an arbitrary number (12) to the offset into the lazy
bind opcodes section, but it should be calculated depending on the
MachONormalizedFileBinaryWrite::lazyBindingInfo result.
I confirmed this bug by patching the code manually in the binary and
writing the right offset in the asm code (__stub_helper).
This patch fixes the content of the atom that contains the assembly code
when the offset is known.
Differential Revision: https://reviews.llvm.org/D35387
------------------------------------------------------------------------
llvm-svn: 319825
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding. This caused comparison with wrong value.
This bug looks to be introduced by r308080 (llvm 5.0). The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.
The bug is fixed by
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315889 91177308-0d34-0410-b5e6-96231b3b80d8
in llvm 6.0.
This patch is largely the same as the fix in llvm 6.0 except
one minor adjustment for the test case.
Reported-by: John Fastabend <john.fastabend@gmail.com>
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 319633
------------------------------------------------------------------------
r316035 | tnorthover | 2017-10-17 14:43:52 -0700 (Tue, 17 Oct 2017) | 6 lines
AArch64: account for possible frame index operand in compares.
If the address of a local is used in a comparison, AArch64 can fold the
address-calculation into the comparison via "adds". Unfortunately, a couple of
places (both hit in this one test) are not ready to deal with that yet and just
assume the first source operand is a register.
------------------------------------------------------------------------
llvm-svn: 319231
------------------------------------------------------------------------
r318848 | hahnfeld | 2017-11-22 09:15:20 -0800 (Wed, 22 Nov 2017) | 7 lines
Fix for OMP doacross implementation on Power
Power has a weak consistency model so we need memory barriers to
make writes (both from runtime and from user code) available for
all threads.
Differential Revision: https://reviews.llvm.org/D40175
------------------------------------------------------------------------
llvm-svn: 319057
------------------------------------------------------------------------
r316106 | labath | 2017-10-18 11:52:16 -0700 (Wed, 18 Oct 2017) | 4 lines
lldb-server tests: Fix undefined behavior
We were creating a StringRef pointing to a temporary string. Problem manifested
itself when running the test on osx.
------------------------------------------------------------------------
llvm-svn: 319035
------------------------------------------------------------------------
r318788 | mcrosier | 2017-11-21 10:08:34 -0800 (Tue, 21 Nov 2017) | 16 lines
[AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as *having* side effects.
This partially reverts r298851. The the underlying issue is that we don't
currently model the dependency between mrs (read system register) and
msr (write system register) instructions.
Something like the below should never be reordered:
msr TPIDR_EL0, x0 ;; set thread pointer
mrs x8, TPIDR_EL0 ;; read thread pointer
but was being reordered after r298851. The functional part of the patch
that wasn't reverted needed to remain in place in order to not break
r299462.
PR35317
------------------------------------------------------------------------
llvm-svn: 318854
------------------------------------------------------------------------
r315086 | compnerd | 2017-10-06 11:06:59 -0700 (Fri, 06 Oct 2017) | 8 lines
Bitcode: add an auto-upgrade for LTO section name
The bitcode reader looks specifically for `__DATA, __objc_catlist` as a
section name. However, SVN r304661 removed the spaces (the two names
are functionally equivalent but do not compare equally
lexicographically). This causes compatibility issues. Add an
auto-upgrade path for removing the spaces as well as use the new name in
the LTO plugin.
------------------------------------------------------------------------
llvm-svn: 318851
------------------------------------------------------------------------
r313398 | steven_wu | 2017-09-15 14:12:14 -0700 (Fri, 15 Sep 2017) | 19 lines
[AutoUpgrade] Fix a compatibility issue with module flag
Summary:
After r304661, module flag to record objective-c image info section is
encoded without whitespaces after comma. The new name is equivalent to
the old one, except that when LTO a module built by old compiler and a
module built by a new compiler, it will fail with conflicting values.
Fix the issue by removing whitespaces in bitcode upgrade path.
rdar://problem/34416934
Reviewers: compnerd
Reviewed By: compnerd
Subscribers: mehdi_amini, hans, llvm-commits
Differential Revision: https://reviews.llvm.org/D37909
------------------------------------------------------------------------
llvm-svn: 318850
------------------------------------------------------------------------
r318039 | labath | 2017-11-13 06:03:17 -0800 (Mon, 13 Nov 2017) | 37 lines
Revert "[lldb] Use OrcMCJITReplacement rather than MCJIT as the underlying JIT for LLDB"
This commit really did not introduce any functional changes (for most
people) but it turns out it's not for the reason we thought it was.
The reason wasn't that Orc is a perfect drop-in replacement for MCJIT,
but it was because we were never using Orc in the first place, as it was
not initialized.
Orc's initialization relies on a global constructor in the LLVMOrcJIT.a.
Since this archive does not expose any symbols referenced from other
object files, it does not get linked into liblldb when linking against
llvm components statically. However, in an LLVM_LINK_LLVM_DYLIB=On
build, LLVMOrcJit.a is linked into libLLVM.so using --whole-archive, so
the global constructor does end up firing.
The result of using Orc jit is pr34194, where lldb fails to evaluate
even very simple expressions. This bug can be reproduced in
non-LLVM_LINK_LLVM_DYLIB builds by making sure Orc jit is linked into
liblldb, for example by #including
llvm/ExecutionEngine/OrcMCJITReplacement.h in IRExecutionUnit.cpp (and
adding OrcJIT as a dependency to the relevant CMakeLists.txt file). The
bug reproduces (at least) on linux and osx.
The root cause of the bug seems to be related to relocation processing.
It seems Orc processes relocations earlier than the system it is
replacing. This means the relocation processing happens before we have
had a chance to remap section load addresses to reflect their address in
the target process memory, so they end up pointing to locations in the
lldb's address space instead.
I am not sure whether this is a bug in Orc jit, or in how we are using
it from lldb, but in any case it is preventing us from using Orc right
now. Reverting this fixes LLVM_LINK_LLVM_DYLIB build, and makes it clear
that we are in fact *not* using Orc, and we never really were.
This reverts commit r279327.
------------------------------------------------------------------------
llvm-svn: 318845
------------------------------------------------------------------------
r315994 | ericwf | 2017-10-17 06:03:17 -0700 (Tue, 17 Oct 2017) | 18 lines
[libc++] Fix PR34898 - vector iterator constructors and assign method perform push_back instead of emplace_back.
Summary:
The constructors `vector(Iter, Iter, Alloc = Alloc{})` and `assign(Iter, Iter)` don't correctly perform EmplaceConstruction from the result of dereferencing the iterator. This results in them performing an additional and unneeded copy.
This patch addresses the issue by correctly using `emplace_back` in C++11 and newer.
There are also some bugs in our `insert` implementation, but those will be handled separately.
@mclow.lists We should probably merge this into 5.1, agreed?
Reviewers: mclow.lists, dlj, EricWF
Reviewed By: mclow.lists, EricWF
Subscribers: cfe-commits, mclow.lists
Differential Revision: https://reviews.llvm.org/D38757
------------------------------------------------------------------------
llvm-svn: 318837
------------------------------------------------------------------------
r312892 | ericwf | 2017-09-10 16:41:20 -0700 (Sun, 10 Sep 2017) | 10 lines
Fix PR34298 - Allow std::function with an incomplete return type.
This patch fixes llvm.org/PR34298. Previously libc++ incorrectly evaluated
the __invokable trait via the converting constructor `function(Tp)` [with Tp = std::function]
whenever the copy constructor or copy assignment operator
was required. This patch further constrains that constructor to short
circut before evaluating the troublesome SFINAE when `Tp` matches
std::function.
The original patch is from Alex Lorenz.
------------------------------------------------------------------------
llvm-svn: 318835
------------------------------------------------------------------------
r318289 | jdevlieghere | 2017-11-15 02:57:05 -0800 (Wed, 15 Nov 2017) | 14 lines
[DebugInfo] Fix potential CU mismatch for SubprogramScopeDIEs.
In constructAbstractSubprogramScopeDIE there can be a potential mismatch
between `this` and the CU of ContextDIE when a scope is shared between
two DISubprograms belonging to a different CU. In that case, `this` is
the CU that was specified in the IR, but the CU of ContextDIE is that of
the first subprogram that was emitted. This patch fixes the mismatch by
looking up the CU of ContextDIE, and switching to use that.
This fixes PR35212 (https://bugs.llvm.org/show_bug.cgi?id=35212)
Patch by Philip Craig!
Differential revision: https://reviews.llvm.org/D39981
------------------------------------------------------------------------
llvm-svn: 318542
------------------------------------------------------------------------
r318207 | sdardis | 2017-11-14 22:26:42 +0000 (Tue, 14 Nov 2017) | 18 lines
Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."
This adjusts the tests to hopfully pacify the
llvm-clang-x86_64-expensive-checks-win buildbot.
Unlike many other instructions, these instructions have aliases which
take coprocessor registers, gpr register, accumulator (and dsp accumulator)
registers, floating point registers, floating point control registers and
coprocessor 2 data and control operands.
For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.
Reviewers: slthakur, atanasyan
Differential Revision: https://reviews.llvm.org/D35253
------------------------------------------------------------------------
llvm-svn: 318386
------------------------------------------------------------------------
r312748 | jroelofs | 2017-09-07 15:01:25 -0700 (Thu, 07 Sep 2017) | 10 lines
Fix validation of the -mthread-model flag in the Clang driver
The ToolChain class validates the -mthread-model flag in the constructor which
doesn't work correctly since the thread model methods are virtual methods. The
check is moved into Clang::ConstructJob() when constructing the internal
command line.
https://reviews.llvm.org/D37496
Patch by: Ian Tessier!
------------------------------------------------------------------------
llvm-svn: 318346
------------------------------------------------------------------------
r312043 | rnk | 2017-08-29 14:44:21 -0700 (Tue, 29 Aug 2017) | 25 lines
[cmake] Stop putting the revision info in LLVM_VERSION_STRING
Summary:
This reduces the number of build actions after a no-op commit from
thousands to about six, which should be acceptable. If six actions is
still too many, developers can disable the LLVM_APPEND_VC_REV cmake
option.
llvm-config.h is a widely included header that should rarely change.
Before this patch, it would change after every re-configure. Very few
users of llvm-config.h need to know the precise version, and those that
do can migrate to incorporating LLVM_REVISION as provided by
llvm/Support/VCSRevision.h.
This should bring LLVM back to the behavior that it had before r306858
from June 30 2017. Most LLVM tools will now print a version string like
"6.0.0svn" instead of "6.0.0-git-c40c2a23de4".
Fixes PR34308
Reviewers: pcc, rafael, hans
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D37272
------------------------------------------------------------------------
llvm-svn: 318344
------------------------------------------------------------------------
r310475 | belleyb | 2017-08-09 06:47:01 -0700 (Wed, 09 Aug 2017) | 28 lines
[Support] PR33388 - Fix formatv_object move constructor
formatv_object currently uses the implicitly defined move constructor,
but it is buggy. In typical use-cases, the problem doesn't show-up
because all calls to the move constructor are elided. Thus, the buggy
constructors are never invoked.
The issue especially shows-up when code is compiled using the
-fno-elide-constructors compiler flag. For instance, this is useful when
attempting to collect accurate code coverage statistics.
The exact issue is the following:
The Parameters data member is correctly moved, thus making the
parameters occupy a new memory location in the target
object. Unfortunately, the default copying of the Adapters blindly
copies the vector of pointers, leaving each of these pointers
referencing the parameters in the original object instead of the copied
one. These pointers quickly become dangling when the original object is
deleted. This quickly leads to crashes.
The solution is to update the Adapters pointers when performing a move.
The copy constructor isn't useful for format objects and can thus be
deleted.
This resolves PR33388.
Differential Revision: https://reviews.llvm.org/D34463
------------------------------------------------------------------------
llvm-svn: 318333
------------------------------------------------------------------------
r315578 | abataev | 2017-10-12 06:51:32 -0700 (Thu, 12 Oct 2017) | 7 lines
[OPENMP] Fix PR34925: Fix getting thread_id lvalue for inlined regions
in C.
If we try to get the lvalue for thread_id variables in inlined regions,
we did not use the correct version of function. Fixed this bug by adding
overrided version of the function getThreadIDVariableLValue for inlined
regions.
------------------------------------------------------------------------
llvm-svn: 318315
------------------------------------------------------------------------
r315464 | abataev | 2017-10-11 08:29:40 -0700 (Wed, 11 Oct 2017) | 5 lines
[OPENMP] Fix PR34916: Crash on mixing taskloop|tasks directives.
If both taskloop and task directives are used at the same time in one
program, we may ran into the situation when the particular type for task
directive is reused for taskloop directives. Patch fixes this problem.
------------------------------------------------------------------------
llvm-svn: 318233
------------------------------------------------------------------------
r310905 | rnk | 2017-08-14 18:17:47 -0700 (Mon, 14 Aug 2017) | 11 lines
Avoid PointerIntPair of constexpr EvalInfo structs
They are stack allocated, so their alignment is not to be trusted.
32-bit MSVC only guarantees 4 byte stack alignment, even though alignof
would tell you otherwise. I tried fixing this with __declspec align, but
that apparently upsets GCC. Hopefully this version will satisfy all
compilers.
See PR32018 for some info about the mingw issues.
Should supercede https://reviews.llvm.org/D34873
------------------------------------------------------------------------
------------------------------------------------------------------------
r310994 | chandlerc | 2017-08-16 00:22:49 -0700 (Wed, 16 Aug 2017) | 6 lines
Fix a UBSan failure where this boolean was copied when uninitialized.
When r310905 moved the pointer and bool out of a PointerIntPair, it made
them end up uninitialized and caused UBSan failures when copying the
uninitialized boolean. However, making the pointer be null should avoid
the reference to the boolean entirely.
------------------------------------------------------------------------
llvm-svn: 318225
------------------------------------------------------------------------
r315310 | sdardis | 2017-10-10 06:34:45 -0700 (Tue, 10 Oct 2017) | 22 lines
[mips] Partially fix PR34391
Previously, the parsing of the 'subu $reg, ($reg,) imm' relied on a parser
which also rendered the operand to the instruction. In some cases the
general parser could construct an MCExpr which was not a MCConstantExpr
which MipsAsmParser was expecting.
Address this by altering the special handling to cope with unexpected inputs
and fine-tune the handling of cases where an register name that is not
available in the current ABI is regarded as not a match for the custom parser
but also not as an outright error.
Also enforces the binutils restriction that only constants are accepted.
This partially resolves PR34391.
Thanks to Ed Maste for reporting the issue!
Reviewers: nitesh.jain, arichardson
Differential Revision: https://reviews.llvm.org/D37476
------------------------------------------------------------------------
llvm-svn: 318192
------------------------------------------------------------------------
r317204 | sdardis | 2017-11-02 05:47:22 -0700 (Thu, 02 Nov 2017) | 15 lines
[mips] Use register scavenging with MSA.
MSA stores and loads to the stack are more likely to require an
emergency GPR spill slot due to the smaller offsets available
with those instructions.
Handle this by overestimating the size of the stack by determining
the largest offset presuming that all callee save registers are
spilled and accounting of incoming arguments when determining
whether an emergency spill slot is required.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D39056
------------------------------------------------------------------------
------------------------------------------------------------------------
r318172 | sdardis | 2017-11-14 11:11:45 -0800 (Tue, 14 Nov 2017) | 5 lines
[mips] Simplify test for 5.0.1 (NFC)
Simplify testing that an emergency spill slot is used when MSA
is used so that it can be included in the 5.0.1 release.
------------------------------------------------------------------------
llvm-svn: 318191
------------------------------------------------------------------------
r317470 | sdardis | 2017-11-06 02:50:04 -0800 (Mon, 06 Nov 2017) | 12 lines
[mips] Fix PR35140
Mark all symbols involved with TLS relocations as being TLS symbols.
This resolves PR35140.
Thanks to Alex Crichton for reporting the issue!
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D39591
------------------------------------------------------------------------
llvm-svn: 318188
------------------------------------------------------------------------
r314798 | sdardis | 2017-10-03 06:45:49 -0700 (Tue, 03 Oct 2017) | 9 lines
[mips] Enable spilling and reloading of the dsp register set.
The dsp register class is an alias of the gpr register class, so
we have to define instructions for spilling and reloading.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D38038
------------------------------------------------------------------------
llvm-svn: 318183
------------------------------------------------------------------------
r310543 | pcc | 2017-08-09 18:07:44 -0700 (Wed, 09 Aug 2017) | 9 lines
Linker: Create a function declaration when moving a non-prevailing alias of function type.
We were previously creating a global variable of function type,
which is invalid IR. This issue was exposed by r304690, in which we
started asserting that global variables were of a valid type.
Fixes PR33462.
Differential Revision: https://reviews.llvm.org/D36438
------------------------------------------------------------------------
llvm-svn: 318181
------------------------------------------------------------------------
r317115 | jlpeyton | 2017-11-01 12:44:42 -0700 (Wed, 01 Nov 2017) | 19 lines
[OpenMP] Fix race condition in omp_init_lock
This is a partial fix for bug 34050.
This prevents callers of omp_set_lock (which does not hold __kmp_global_lock)
from ever seeing an uninitialized version of __kmp_i_lock_table.table.
It does not solve a use-after-free race condition if omp_set_lock obtains a
pointer to __kmp_i_lock_table.table before it is updated and then attempts to
dereference afterwards. That race is far less likely and can be handled in a
separate patch.
The unit test usually segfaults on the current trunk revision. It passes with
the patch.
Patch by Adam Azarchs
Differential Revision: https://reviews.llvm.org/D39439
------------------------------------------------------------------------
llvm-svn: 318178
------------------------------------------------------------------------
r316452 | jlpeyton | 2017-10-24 09:10:09 -0700 (Tue, 24 Oct 2017) | 9 lines
Disable threadprivate data cleanup if runtime is terminating
The problem is due to the runtime's threadprivate cleanup code which tries to
access data that was already destroyed by one of the root threads.
__kmp_init_gtid is used as a checker here since it is set to false before actual
resource cleanup is done in __kmp_cleanup().
Patch by Hansang Bae
------------------------------------------------------------------------
llvm-svn: 318176
------------------------------------------------------------------------
r311269 | jlpeyton | 2017-08-19 16:53:36 -0700 (Sat, 19 Aug 2017) | 8 lines
Use va_copy instead of __va_copy to fix building libomp against musl libc
Fixes https://bugs.llvm.org/show_bug.cgi?id=34040
Patch by Peter Levine
Differential Revision: https://reviews.llvm.org/D36343
------------------------------------------------------------------------
llvm-svn: 318175
------------------------------------------------------------------------
r309875 | jlpeyton | 2017-08-02 13:06:32 -0700 (Wed, 02 Aug 2017) | 11 lines
Move lock acquire/release functions in task deque cleanup code
The original locations can be reached without initializing the lock variable
(td_deque_lock), so it is potentially unsafe. It is guaranteed that the lock
is initialized if the deque (td_deque) is not NULL, and lock functions can be
safely called.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D36017
------------------------------------------------------------------------
llvm-svn: 318171
------------------------------------------------------------------------
r315611 | abataev | 2017-10-12 13:03:39 -0700 (Thu, 12 Oct 2017) | 5 lines
[OPENMP] Fix PR34927: Emit initializer for reduction array with declare
reduction.
If the reduction is an array or an array section and reduction operation
is declare reduction without initializer, it may lead to crash.
------------------------------------------------------------------------
llvm-svn: 318120
------------------------------------------------------------------------
r315586 | abataev | 2017-10-12 08:18:41 -0700 (Thu, 12 Oct 2017) | 5 lines
[OPENMP] Fix PR34926: Fix handling of the array sections passed as
function params.
Codegen could crash if the array section base expression is the
function parameter.
------------------------------------------------------------------------
llvm-svn: 318118
------------------------------------------------------------------------
r313675 | rcraik | 2017-09-19 14:04:23 -0700 (Tue, 19 Sep 2017) | 9 lines
[OpenMP] fix seg-faults printing diagnostics with invalid ordered(n) values
When the value specified for n in ordered(n) is larger than the number of loops a segmentation fault can occur in one of two ways when attempting to print out a diagnostic for an associated depend(sink : vec):
1) The iteration vector vec contains less than n items
2) The iteration vector vec contains a variable that is not a loop control variable
This patch addresses both of these issues.
Differential Revision: https://reviews.llvm.org/D38049
------------------------------------------------------------------------
llvm-svn: 318116
------------------------------------------------------------------------
r312292 | abataev | 2017-08-31 16:06:52 -0700 (Thu, 31 Aug 2017) | 8 lines
[OPENMP] Fix for PR34398: assert with random access iterator if the
step>1.
If the loop is a loot with random access iterators and the iteration
construct is represented it += n, then the compiler crashed because of
reusing of the same MaterializedTemporaryExpr around N. Patch fixes it
by using the expression as written, without any special kind of
wrappings.
------------------------------------------------------------------------
llvm-svn: 318113
------------------------------------------------------------------------
r311777 | abataev | 2017-08-25 08:43:55 -0700 (Fri, 25 Aug 2017) | 5 lines
[OPENMP] Fix for PR34321: ustom OpenMP reduction in C++ template causes
SEGFAULT at compile time
Compiler crashed when tried to rebuild non-template expression in
dependent context.
------------------------------------------------------------------------
llvm-svn: 318107
------------------------------------------------------------------------
r311013 | abataev | 2017-08-16 08:58:46 -0700 (Wed, 16 Aug 2017) | 7 lines
[OPENMP] Fix for PR28581: OpenMP linear clause - wrong results.
If worksharing construct has at least one linear item, an implicit
synchronization point must be emitted to avoid possible conflict with
the loading/storing values to the original variables. Added implicit
barrier if the linear item is found before actual start of the
worksharing construct.
------------------------------------------------------------------------
llvm-svn: 318105
------------------------------------------------------------------------
r309288 | erichkeane | 2017-07-27 09:28:20 -0700 (Thu, 27 Jul 2017) | 32 lines
Fix double destruction of objects when OpenMP construct is canceled
When an omp for loop is canceled the constructed objects are being destructed
twice.
It looks like the desired code is:
{
Obj o;
If (cancelled) branch-through-cleanups to cancel.exit.
}
[cleanups]
cancel.exit:
__kmpc_for_static_fini
br cancel.cont (*)
cancel.cont:
__kmpc_barrier
return
The problem seems to be the branch to cancel.cont is currently also going
through the cleanups calling them again. This change just does a direct branch
instead.
Patch By: michael.p.rice@intel.com
Differential Revision: https://reviews.llvm.org/D35854
------------------------------------------------------------------------
llvm-svn: 318099
------------------------------------------------------------------------
r316824 | haicheng | 2017-10-27 19:27:14 -0700 (Fri, 27 Oct 2017) | 7 lines
[ConstantFold] Fix a crash when folding a GEP that has vector index
LLVM crashes when factoring out an out-of-bound index into preceding dimension
and the preceding dimension uses vector index. Simply bail out now when this
case happens.
Differential Revision: https://reviews.llvm.org/D38677
------------------------------------------------------------------------
llvm-svn: 318096
------------------------------------------------------------------------
r312693 | marshall | 2017-09-06 21:19:32 -0700 (Wed, 06 Sep 2017) | 1 line
Add even more string_view tests. These found some bugs in the default parameter value for rfind/find_last_of/find_last_not_of
------------------------------------------------------------------------
llvm-svn: 318088
------------------------------------------------------------------------
r315485 | spatel | 2017-10-11 11:24:21 -0700 (Wed, 11 Oct 2017) | 7 lines
[x86] avoid infinite loop from SoftenFloatOperand (PR34866)
Legalization of fp128 assumes things that we should have asserts for,
so that's another potential improvement.
Differential Revision: https://reviews.llvm.org/D38771
------------------------------------------------------------------------
llvm-svn: 316607
------------------------------------------------------------------------
r314898 | dylanmckay | 2017-10-04 23:37:22 +1300 (Wed, 04 Oct 2017) | 6 lines
[AVR] Implement LPMWRdZ pseudo-instruction's expansion.
FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should
refactor a bit and unify the two
Patch by Gerdo Erdi.
------------------------------------------------------------------------
llvm-svn: 315836
------------------------------------------------------------------------
r314891 | dylanmckay | 2017-10-04 22:51:28 +1300 (Wed, 04 Oct 2017) | 8 lines
[AVR] Insert JMP for long branches
Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.
Patch by Thomas Backman.
------------------------------------------------------------------------
llvm-svn: 315833
------------------------------------------------------------------------
r314890 | dylanmckay | 2017-10-04 22:51:21 +1300 (Wed, 04 Oct 2017) | 16 lines
[AVR] Fix displacement overflow for LDDW/STDW
In some cases, the code generator attempts to generate instructions such as:
lddw r24, Y+63
which expands to:
ldd r24, Y+63
ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary
This commit limits the first offset to 62, and thus the second to 63.
It also updates some asserts in AVRExpandPseudoInsts.cpp, including for
INW and OUTW, which appear to be unused.
Patch by Thomas Backman.
------------------------------------------------------------------------
llvm-svn: 315832
------------------------------------------------------------------------
r312357 | davide | 2017-09-01 12:54:08 -0700 (Fri, 01 Sep 2017) | 9 lines
[TTI] Fix getGEPCost() for geps with a single operand.
Previously this would sporadically crash as TargetType
was never initialized. We special-case the single-operand
case returning earlier and trying to mimic the behaviour of
isLegalAddressingMode as closely as possible.
Differential Revision: https://reviews.llvm.org/D37277
------------------------------------------------------------------------
llvm-svn: 315663
------------------------------------------------------------------------
r314513 | hahnfeld | 2017-09-29 06:53:03 -0700 (Fri, 29 Sep 2017) | 3 lines
[test] Fix uninitialized memory in omp_taskloop_grainsize.c
result was never initialized to zero which sometimes failed the test.
------------------------------------------------------------------------
llvm-svn: 315353
------------------------------------------------------------------------
r309979 | mgorny | 2017-08-03 12:41:33 -0700 (Thu, 03 Aug 2017) | 16 lines
[test] Fix clang library dir in LD_LIBRARY_PATH For stand-alone build
Prepend the clang library directory (determined using SHLIBDIR, alike
in clang) to the LD_LIBRARY_PATH to ensure that just-built clang
libraries will be used instead of a previous installed version.
When a stand-alone build is performed, LLVM_LIBS_DIR contains the path
to installed LLVM library directory. The same directory frequently
contains a previously installed version of clang. SHLIBDIR, on the other
hand, is always the build-tree directory, and therefore contains
the freshly built clang libraries.
In a non-stand-alone build, both paths will be the same and therefore
including them both will not cause any issues.
Differential Revision: https://reviews.llvm.org/D30155
------------------------------------------------------------------------
llvm-svn: 315224
------------------------------------------------------------------------
r313366 | ctopper | 2017-09-15 10:09:03 -0700 (Fri, 15 Sep 2017) | 9 lines
[X86] Don't create i64 constants on 32-bit targets when lowering v64i1 constant build vectors
When handling a v64i1 build vector of constants on 32-bit targets we were creating an illegal i64 constant that we then bitcasted back to v64i1. We need to instead create two 32-bit constants, bitcast them to v32i1 and concat the result. We should also take care to handle the halves being all zeros/ones after the split.
This patch splits the build vector and then recursively lowers the two pieces. This allows us to handle the all ones and all zeros cases with minimal effort. Ideally we'd just do the split and concat, and let lowering get called again on the new nodes, but getNode has special handling for CONCAT_VECTORS that reassembles the pieces back into a single BUILD_VECTOR. Hopefully the two temporary BUILD_VECTORS we had to create to do this that don't get returned don't cause any issues.
Fixes PR34605.
Differential Revision: https://reviews.llvm.org/D37858
------------------------------------------------------------------------
llvm-svn: 315198
------------------------------------------------------------------------
r314891 | dylanmckay | 2017-10-04 22:51:28 +1300 (Wed, 04 Oct 2017) | 8 lines
[AVR] Insert JMP for long branches
Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.
Patch by Thomas Backman.
------------------------------------------------------------------------
llvm-svn: 314892
------------------------------------------------------------------------
r312706 | anng | 2017-09-07 01:43:56 -0700 (Thu, 07 Sep 2017) | 14 lines
[LLD] Fix padding of .eh_frame when in executable segment
The default padding for an executable segment is the target trap
instruction which for x86_64 is 0xCC. However, the .eh_frame section
requires the padding to be zero. The code that writes the .eh_frame
section assumes that its segment is zero initialized and does not
explicitly write the zero padding. This does not work when the .eh_frame
section is in the executable segment (for example when using
-no-rosegment).
This patch changes the .eh_frame writing code to explicitly write the
zero padding.
Differential Revision: https://reviews.llvm.org/D37462
------------------------------------------------------------------------
llvm-svn: 314861
------------------------------------------------------------------------
r313741 | grimar | 2017-09-20 02:27:41 -0700 (Wed, 20 Sep 2017) | 9 lines
[ELF] - Fix segfault when processing .eh_frame.
Its a PR34648 which was a segfault that happened because
we stored pointers to elements in DenseMap.
When DenseMap grows such pointers are invalidated.
Solution implemented is to keep elements by pointer
and not by value.
Differential revision: https://reviews.llvm.org/D38034
------------------------------------------------------------------------
llvm-svn: 314860
[AArch64] Fix bug in store of vector 0 DAGCombine.
Summary:
Avoid using XZR/WZR directly as operands to split stores of zero
vectors. Doing so can lead to the XZR/WZR being used by an instruction
that doesn't allow it (e.g. add).
Fixes bug 34674.
Reviewers: t.p.northover, efriedma, MatzeB
Subscribers: aemerson, rengolin, javed.absar, mcrosier, eraman, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D38146
PR34695.
llvm-svn: 314796
------------------------------------------------------------------------
r313392 | ctopper | 2017-09-15 13:27:59 -0700 (Fri, 15 Sep 2017) | 7 lines
[X86] Disable _mm512_maskz_set1_epi64 intrinsic on 32-bit targets to prevent a backend isel failure.
The __builtin_ia32_pbroadcastq512_mem_mask we were previously trying to use in 32-bit mode is not implemented in the x86 backend and causes isel to fail in release builds. In debug builds it fails even earlier during legalization with an llvm_unreachable.
While there add the missing test case for this intrinsic for this for 64-bit mode.
This fixes PR34631. D37668 should be able to recover this for 32-bit mode soon. But I wanted to fix the crash ahead of that.
------------------------------------------------------------------------
llvm-svn: 314569
------------------------------------------------------------------------
r311951 | adrian | 2017-08-28 16:07:43 -0700 (Mon, 28 Aug 2017) | 6 lines
Fix a logic error in DwarfExpression::addMachineReg()
This fixes PR34323 and thus splitting undescribable registers into
smaller, describable sub-registers.
https://bugs.llvm.org/show_bug.cgi?id=34323
------------------------------------------------------------------------
------------------------------------------------------------------------
r312038 | joerg | 2017-08-29 14:18:07 -0700 (Tue, 29 Aug 2017) | 2 lines
Simplify test case, so that it works for both trunk and release-5.0.
------------------------------------------------------------------------
llvm-svn: 314567
------------------------------------------------------------------------
r312348 | matze | 2017-09-01 11:36:26 -0700 (Fri, 01 Sep 2017) | 39 lines
LiveIntervalAnalysis: Fix alias regunit reserved definition
A register in CodeGen can be marked as reserved: In that case we
consider the register always live and do not use (or rather ignore)
kill/dead/undef operand flags.
LiveIntervalAnalysis however tracks liveness per register unit (not per
register). We already needed adjustments for this in r292871 to deal
with super/sub registers. However I did not look at aliased register
there. Looking at ARM:
FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV
(regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV) hence they share a register unit
(FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers.
This shared register unit was previously considered non-reserved,
however given that we uses of the reserved FPSCR potentially violate
some rules (like uses without defs) we should make FPSCR~FPSCR_NZCV
reserved too and stop tracking liveness for it.
This patch:
- Defines a register unit as reserved when: At least for one root
register, the root register and all its super registers are reserved.
- Adjust LiveIntervals::computeRegUnitRange() for new reserved
definition.
- Add MachineRegisterInfo::isReservedRegUnit() to have a canonical way
of testing.
- Stop computing LiveRanges for reserved register units in HMEditor even
with UpdateFlags enabled.
- Skip verification of uses of reserved reg units in the machine
verifier (this usually didn't happen because there would be no cached
liverange but there is no guarantee for that and I would run into this
case before the HMEditor tweak, so may as well fix the verifier too).
Note that this should only affect ARMs FPSCR/FPSCR_NZCV registers today;
aliased registers are rarely used, the only other cases are hexagons
P0-P3/P3_0 and C8/USR pairs which are not mixing reserved/non-reserved
registers in an alias.
Differential Revision: https://reviews.llvm.org/D37356
------------------------------------------------------------------------
llvm-svn: 314565
------------------------------------------------------------------------
r314252 | gberry | 2017-09-26 14:40:46 -0700 (Tue, 26 Sep 2017) | 12 lines
[AArch64][Falkor] Fix bug in falkor prefetcher fix pass.
Summary:
In rare cases, loads that don't get prefetched that were marked as
strided loads could cause a crash if they occurred in a loop with other
colliding loads.
Reviewers: mcrosier
Subscribers: aemerson, rengolin, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D38261
------------------------------------------------------------------------
llvm-svn: 314555
------------------------------------------------------------------------
r314251 | gberry | 2017-09-26 14:40:41 -0700 (Tue, 26 Sep 2017) | 16 lines
[AArch64][Falkor] Fix correctness bug in falkor prefetcher fix pass and correct some opcode tag computations.
Summary:
This addresses a correctness bug for LD[1234]*_POST opcodes that have
the prefetcher fix applied to them: the base register was not being
written back from the temp after being incremented, so it would appear
to never be incremented.
Also, fix some opcode tag computations based on some updated HW details
to get better tag avoidance and thus better prefetcher performance.
Reviewers: mcrosier
Subscribers: aemerson, rengolin, javed.absar, kristof.beyls
Differential Revision: https://reviews.llvm.org/D38256
------------------------------------------------------------------------
llvm-svn: 314554
------------------------------------------------------------------------
r311599 | gberry | 2017-08-23 14:11:28 -0700 (Wed, 23 Aug 2017) | 4 lines
[AArch64][Falkor] Fix bug in Falkor HWPF tag collision avoidance
LDPDi was incorrectly marked as ignoring the destination register in the
prefetcher tag.
------------------------------------------------------------------------
llvm-svn: 314553
------------------------------------------------------------------------
r312447 | hfinkel | 2017-09-03 10:18:25 -0700 (Sun, 03 Sep 2017) | 12 lines
[CodeGen] Treat all vector fields as mayalias
Because it is common to treat vector types as an array of their elements, or
even some other type that's not the element type, and thus index into them, we
can't use struct-path TBAA for these accesses. Even though we already treat all
vector types as equivalent to 'char', we were using field-offset information
for them with TBAA, and this renders undefined the intra-value indexing we
intend to allow. Note that, although 'char' is universally aliasing, with path
TBAA, we can still differentiate between access to s.a and s.b in
struct { char a, b; } s;. We can't use this capability as-is for vector types.
Fixes PR33967.
------------------------------------------------------------------------
llvm-svn: 314476
------------------------------------------------------------------------
r312651 | jroelofs | 2017-09-06 10:09:25 -0700 (Wed, 06 Sep 2017) | 23 lines
Fix ARM bare metal driver to support atomics
The new bare metal support only supports the single thread model. This causes
the builtin atomic functions (e.g.: __atomic_fetch_add) to not generate
thread-safe assembly for these operations, which breaks our firmware. We target
bare metal, and need to atomically modify variables in our interrupt routines,
and task threads.
Internally, the -mthread-model flag determines whether to lower or expand
atomic operations (see D4984).
This change removes the overridden thread model methods, and instead relies on
the base ToolChain class to validate the thread model (which already includes
logic to validate single thread model support). If the single thread model is
required, the -mthread-model flag will have to be provided.
As a workaround "-mthread-model posix" could be provided, but it only works due
to a bug in the validation of the -mthread-model flag (separate patch coming to
fix this).
https://reviews.llvm.org/D37493
Patch by: Ian Tessier!
------------------------------------------------------------------------
llvm-svn: 314464
------------------------------------------------------------------------
r311921 | joerg | 2017-08-28 22:20:47 +0200 (Mon, 28 Aug 2017) | 16 lines
Fix ARMv4 support
ARMv4 doesn't support the "BX" instruction, which has been introduced
with ARMv4t. Adjust the call lowering and tail call implementation
accordingly.
Further changes are necessary to ensure that presence of the v4t feature
is correctly set. Most importantly, the "generic" CPU for thumb-*
triples should include ARMv4t, since thumb mode without thumb support
would naturally be pointless.
Add a couple of asserts to ensure thumb instructions are not emitted
without CPU support.
Differential Revision: https://reviews.llvm.org/D37030
------------------------------------------------------------------------
llvm-svn: 314417
------------------------------------------------------------------------
r314180 | dylanmckay | 2017-09-26 13:51:03 +1300 (Tue, 26 Sep 2017) | 7 lines
[AVR] When lowering shifts into loops, put newly generated MBBs in the same
spot as the original MBB
Discovered in avr-rust/rust#62https://github.com/avr-rust/rust/issues/62
Patch by Gergo Erdi.
------------------------------------------------------------------------
llvm-svn: 314383
------------------------------------------------------------------------
r314183 | dylanmckay | 2017-09-26 15:07:54 +1300 (Tue, 26 Sep 2017) | 3 lines
[AVR] Fix the build after setting alignment to 1 in r314179
Changing all types to be byte-aligned broke a small number of tests.
------------------------------------------------------------------------
llvm-svn: 314382
------------------------------------------------------------------------
r314354 | dylanmckay | 2017-09-28 11:09:01 +1300 (Thu, 28 Sep 2017) | 3 lines
[AVR] Update data layout to match current LLVM trunk
The data layout was changed in r314179 to fix atomic loads and stores.
------------------------------------------------------------------------
llvm-svn: 314381
------------------------------------------------------------------------
r314179 | dylanmckay | 2017-09-26 13:45:27 +1300 (Tue, 26 Sep 2017) | 11 lines
[AVR] Use 1-byte alignment for all data types
This was an oversight in the original backend data layout.
The AVR architecture does not have the concept of unaligned loads - all
loads/stores from all addresses are aligned to one byte.
Discovered in avr-rust issue #64https://github.com/avr-rust/rust/issues/64
Patch By Gergo Erdi.
------------------------------------------------------------------------
llvm-svn: 314379
------------------------------------------------------------------------
r312905 | dylanmckay | 2017-09-11 22:32:51 +1200 (Mon, 11 Sep 2017) | 10 lines
[AVR] Enable the '__do_copy_data' function
Also enables '__do_clear_bss'.
These functions are automaticalled called by the CRT if they are
declared.
We need these to be called otherwise RAM will start completely
uninitialised, even though we need to copy RAM variables from progmem to
RAM.
------------------------------------------------------------------------
llvm-svn: 314356
------------------------------------------------------------------------
r312337 | nha | 2017-09-01 09:56:32 -0700 (Fri, 01 Sep 2017) | 12 lines
AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states
Summary:
This fixes a bug that was exposed on gfx9 in various
GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests,
e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D36193
------------------------------------------------------------------------
llvm-svn: 314327
------------------------------------------------------------------------
r312337 | nha | 2017-09-01 09:56:32 -0700 (Fri, 01 Sep 2017) | 12 lines
AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states
Summary:
This fixes a bug that was exposed on gfx9 in various
GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests,
e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D36193
------------------------------------------------------------------------
llvm-svn: 314324
------------------------------------------------------------------------
r313334 | tstellar | 2017-09-14 19:25:22 -0700 (Thu, 14 Sep 2017) | 15 lines
merge-request.sh: Update to use new "Fixed by Commit(s)" field
Summary:
This will be used instead of the url field to track which commits need
to be merged.
This patch also drops support for version 1.x of the bugzilla CLI tool.
Reviewers: hansw, hans
Reviewed By: hans
Subscribers: hans, llvm-commits
Differential Revision: https://reviews.llvm.org/D37786
------------------------------------------------------------------------
llvm-svn: 313337
------------------------------------------------------------------------
r312285 | ctopper | 2017-08-31 14:39:23 -0700 (Thu, 31 Aug 2017) | 11 lines
[X86] Don't pull carry through X86ISD::ADD carryin, -1 if we can't guranteed we're really using the carry flag from the add.
Prior to this patch we had a DAG combine that tried to bypass an X86ISD::ADD with -1 being added to the carry flag of some previous operation. We would then pass the carry flag directly to user.
But this is only safe if the user is looking for the carry flag and not the zero flag.
So we need to only do this combine in a context where we know what flag the consumer is using.
Fixes PR34381.
Differential Revision: https://reviews.llvm.org/D37317
------------------------------------------------------------------------
llvm-svn: 312333
------------------------------------------------------------------------
r312022 | hans | 2017-08-29 11:41:00 -0700 (Tue, 29 Aug 2017) | 10 lines
[DAG] Bound loop dependence check in merge optimization.
The loop dependence check looks for dependencies between store merge
candidates not captured by the chain sub-DAG doing a check of
predecessors which may be very large. Conservatively bound number of
nodes checked for compilation time. (Resolves PR34326).
Landing on behalf of Nirav Dave to unblock the 5.0.0 release.
Differential Revision: https://reviews.llvm.org/D37220
------------------------------------------------------------------------
llvm-svn: 312041
------------------------------------------------------------------------
r312008 | cbieneman | 2017-08-29 09:13:41 -0700 (Tue, 29 Aug 2017) | 7 lines
[IPv6] Fix a bug in the IPv6 listen behavior
The socket bind address should either be localhost or anyaddress. This bug in the listen behavior was preventing lldb-server from opening sockets for non-localhost connections.
The added test verifies that opening an anyaddress socket works and has a non-zero port assignment.
This should resolve PR34183.
------------------------------------------------------------------------
llvm-svn: 312016
------------------------------------------------------------------------
r311823 | rsmith | 2017-08-25 18:04:35 -0700 (Fri, 25 Aug 2017) | 16 lines
Add flag to request Clang is ABI-compatible with older versions of itself
This patch adds a flag -fclang-abi-compat that can be used to request that
Clang attempts to be ABI-compatible with some older version of itself.
This is provided on a best-effort basis; right now, this can be used to undo
the ABI change in r310401, reverting Clang to its prior C++ ABI for pass/return
by value of class types affected by that change, and to undo the ABI change in
r262688, reverting Clang to using integer registers rather than SSE registers
for passing <1 x long long> vectors. The intent is that we will maintain this
backwards compatibility path as we make ABI-breaking fixes in future.
The reversion to the old behavior for r310401 is also applied to the PS4 target
since that change is not part of its platform ABI (which is essentially to do
whatever Clang 3.2 did).
------------------------------------------------------------------------
llvm-svn: 312013
------------------------------------------------------------------------
r311792 | djasper | 2017-08-25 12:14:53 -0700 (Fri, 25 Aug 2017) | 9 lines
[Format] Invert nestingAndIndentLevel pair in WhitespaceManager used for
alignments
Indent should be compared before nesting level to determine if a token
is on the same scope as the one we align with. Because it was inverted,
clang-format sometimes tried to align tokens with tokens from outer
scopes, causing the assert(Shift >= 0) to fire.
This fixes bug #33507. Patch by Beren Minor, thank you!
------------------------------------------------------------------------
llvm-svn: 311800
------------------------------------------------------------------------
r311695 | rsmith | 2017-08-24 13:10:33 -0700 (Thu, 24 Aug 2017) | 9 lines
[ubsan] PR34266: When sanitizing the 'this' value for a member function that happens to be a lambda call operator, use the lambda's 'this' pointer, not the captured enclosing 'this' pointer (if any).
Do not sanitize the 'this' pointer of a member call operator for a lambda with
no capture-default, since that call operator can legitimately be called with a
null this pointer from the static invoker function. Any actual call with a null
this pointer should still be caught in the caller (if it is being sanitized).
This reinstates r311589 (reverted in r311680) with the above fix.
------------------------------------------------------------------------
llvm-svn: 311799
------------------------------------------------------------------------
r311674 | hans | 2017-08-24 10:00:36 -0700 (Thu, 24 Aug 2017) | 3 lines
Mark allocator_oom_test.cc unsupported on arm & aarch64 (PR33972)
The buildbots don't seem to like it.
------------------------------------------------------------------------
llvm-svn: 311736
------------------------------------------------------------------------
r311601 | adrian | 2017-08-23 14:24:12 -0700 (Wed, 23 Aug 2017) | 5 lines
Fix a bug in CGDebugInfo::EmitInlineFunctionStart causing DILocations to be
parented in function declarations.
Fixes PR33997.
https://bugs.llvm.org/show_bug.cgi?id=33997
------------------------------------------------------------------------
llvm-svn: 311671
------------------------------------------------------------------------
r311623 | hans | 2017-08-23 18:08:27 -0700 (Wed, 23 Aug 2017) | 11 lines
[DAG] Fix Node Replacement in PromoteIntBinOp
When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.
Fixes PR34137.
Landing on behalf of Nirav.
Differential Revision: https://reviews.llvm.org/D36581
------------------------------------------------------------------------
llvm-svn: 311670
------------------------------------------------------------------------
r311555 | oleg | 2017-08-23 07:26:31 -0700 (Wed, 23 Aug 2017) | 14 lines
[ARM][Compiler-rt] Fix AEABI builtins to correctly pass arguments to non-AEABI functions on HF targets
Summary:
This is a patch for PR34167.
On HF targets functions like `__{eq,lt,le,ge,gt}df2` and `__{eq,lt,le,ge,gt}sf2` expect their arguments to be passed in d/s registers, while some of the AEABI builtins pass them in r registers.
Reviewers: compnerd, peter.smith, asl
Reviewed By: peter.smith, asl
Subscribers: peter.smith, aemerson, dberris, javed.absar, llvm-commits, asl, kristof.beyls
Differential Revision: https://reviews.llvm.org/D36675
------------------------------------------------------------------------
llvm-svn: 311606
------------------------------------------------------------------------
r311554 | mcrosier | 2017-08-23 07:10:06 -0700 (Wed, 23 Aug 2017) | 10 lines
[Reassociate] Don't canonicalize x + (-Constant * y) -> x - (Constant * y)..
..if the resulting subtract will be broken up later. This can cause us to get
into an infinite loop.
x + (-5.0 * y) -> x - (5.0 * y) ; Canonicalize neg const
x - (5.0 * y) -> x + (0 - (5.0 * y)) ; Break up subtract
x + (0 - (5.0 * y)) -> x + (-5.0 * y) ; Replace 0-X with X*-1.
PR34078
------------------------------------------------------------------------
llvm-svn: 311603
------------------------------------------------------------------------
r311565 | hans | 2017-08-23 08:43:28 -0700 (Wed, 23 Aug 2017) | 8 lines
LowerAtomic: Don't skip optnone functions; atomic still need lowering (PR34020)
The lowering isn't really an optimization, so optnone shouldn't make a
difference. ARM relies on the pass running when using "-mthread-model
single", because in that mode, it doesn't run AtomicExpand. See bug for
more details.
Differential Revision: https://reviews.llvm.org/D37040
------------------------------------------------------------------------
llvm-svn: 311602
This caused PR34080, which seems to have been fixed by r310792, but that change
introduced severe performance regressions.
Reverting to unblock the 5.0.0 release while these issues are worked out on trunk.
Also reverting a few tests that were added later and depended on the new scheduling:
LLVM :: CodeGen/X86/f16c-schedule.ll
LLVM :: CodeGen/X86/lea32-schedule.ll
LLVM :: CodeGen/X86/lea64-schedule.ll
LLVM :: CodeGen/X86/popcnt-schedule.ll
llvm-svn: 311600
The header change caused problems; see PR34182, and PR33858 from #9 onwards, as
well as the discussion on the r309226 cfe-commits thread.
These changes don't seem to be addressing any regression from 4.0.0, so rather
than scrambling to fix this on the branch, let's revert to safety.
llvm-svn: 311597
------------------------------------------------------------------------
r311572 | ctopper | 2017-08-23 09:41:02 -0700 (Wed, 23 Aug 2017) | 9 lines
[AVX512] Don't create SHRUNKBLEND SDNodes for 512-bit vectors
There are no 512-bit blend instructions so we shouldn't create SHRUNKBLEND for them.
On a side note, it looks like there may be a missed opportunity for constant folding TESTM when LHS and RHS are equal.
This fixes PR34139.
Differential Revision: https://reviews.llvm.org/D36992
------------------------------------------------------------------------
llvm-svn: 311593
------------------------------------------------------------------------
r311330 | ibiryukov | 2017-08-21 05:03:08 -0700 (Mon, 21 Aug 2017) | 16 lines
Fixed a crash on replaying Preamble's PP conditional stack.
Summary:
The crash occurs when the first token after a preamble is a macro
expansion.
Fixed by moving replayPreambleConditionalStack from Parser into
Preprocessor. It is now called right after the predefines file is
processed.
Reviewers: erikjv, bkramer, klimek, yvvan
Reviewed By: bkramer
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36872
------------------------------------------------------------------------
llvm-svn: 311591
------------------------------------------------------------------------
r311579 | compnerd | 2017-08-23 10:23:12 -0700 (Wed, 23 Aug 2017) | 6 lines
Process: fix FXSAVE on x86
The FXSAVE member `ftw` (FPU Tag Word) was given the wrong size (8-bit)
instead of the correct width (16-bit) as per the x87 Programmer's
Manual. Adjust this to ensure that we print out the complete value for
the register.
------------------------------------------------------------------------
llvm-svn: 311585
------------------------------------------------------------------------
r311496 | hans | 2017-08-22 14:54:37 -0700 (Tue, 22 Aug 2017) | 1 line
[profile] Fix warning about C++ style comment in C file
------------------------------------------------------------------------
llvm-svn: 311584
------------------------------------------------------------------------
r311495 | hans | 2017-08-22 14:54:37 -0700 (Tue, 22 Aug 2017) | 6 lines
[esan] Use stack_t instead of struct sigaltstack (PR34011)
The struct tag is going away in soon-to-be-released glibc 2.26 and the
stack_t typedef seems to have been there forever.
Patch by Bernhard Rosenkraenzer!
------------------------------------------------------------------------
llvm-svn: 311583
------------------------------------------------------------------------
r311532 | krasimir | 2017-08-23 00:18:36 -0700 (Wed, 23 Aug 2017) | 24 lines
[clang-format] Align trailing comments if ColumnLimit is 0
Summary:
ColumnLimit = 0 means no limit, so comment should always be aligned if requested. This was broken with
https://llvm.org/svn/llvm-project/cfe/trunk@304687
introduced via
https://reviews.llvm.org/D33830
and is included in 5.0.0-rc2. This commit fixes it and adds a unittest for this property.
Should go into clang-5.0 IMHO.
Contributed by @pboettch!
Reviewers: djasper, krasimir
Reviewed By: djasper, krasimir
Subscribers: hans, klimek
Differential Revision: https://reviews.llvm.org/D36967
------------------------------------------------------------------------
llvm-svn: 311573
------------------------------------------------------------------------
r311397 | ahatanak | 2017-08-21 15:46:46 -0700 (Mon, 21 Aug 2017) | 8 lines
[Driver][Darwin] Do not pass -munwind-table if -fno-excpetions is
supplied.
With this change, -fno-exceptions disables unwind tables unless
-funwind-tables is supplied too or the target is x86-64 (x86-64 requires
emitting unwind tables).
rdar://problem/33934446
------------------------------------------------------------------------
llvm-svn: 311505
------------------------------------------------------------------------
r311391 | stl_msft | 2017-08-21 15:19:33 -0700 (Mon, 21 Aug 2017) | 28 lines
[Driver] Recognize DevDiv internal builds of MSVC, with a different directory structure.
This is a reasonably non-intrusive change, which I've verified
works for both x86 and x64 DevDiv-internal builds.
The idea is to change `bool IsVS2017OrNewer` into a 3-state
`ToolsetLayout VSLayout`. Either a build is DevDiv-internal,
released VS 2017 or newer, or released VS 2015 or older. When looking at
the directory structure, if instead of `"VC"` we see `"x86ret"`, `"x86chk"`,
`"amd64ret"`, or `"amd64chk"`, we recognize this as a DevDiv-internal build.
After we get past the directory structure validation, we use this knowledge
to regenerate paths appropriately. `llvmArchToDevDivInternalArch()` knows how
we use `"i386"` subdirectories, and `MSVCToolChain::getSubDirectoryPath()`
uses that. It also knows that DevDiv-internal builds have an `"inc"`
subdirectory instead of `"include"`.
This may still not be the "right" fix in any sense, but I believe that it's
non-intrusive in the sense that if the special directory names aren't found,
no codepaths are affected. (`ToolsetLayout::OlderVS` and
`ToolsetLayout::VS2017OrNewer` correspond to `IsVS2017OrNewer` being `false`
or `true`, respectively.) I searched for all references to `IsVS2017OrNewer`,
which are places where Clang cares about VS's directory structure, and the
only one that isn't being patched is some logic to deal with
cross-compilation. I'm fine with that not working for DevDiv-internal builds
for the moment (we typically test the native compilers), so I added a comment.
Fixes D36860.
------------------------------------------------------------------------
llvm-svn: 311500
------------------------------------------------------------------------
r311387 | steven_wu | 2017-08-21 14:49:13 -0700 (Mon, 21 Aug 2017) | 16 lines
[IR] AutoUpgrade ModuleFlagBehavior for PIC and PIE level
Summary:
From r303590, ModuleFlagBehavior for PIC and PIE level is changed from
Error to Max. This will cause bitcode compatibility issue when linking
against a bitcode static archive built with old compiler.
Add an auto-ugprade path to upgrade the the ModuleFlagBehavior in the
old bitcode to match the new one so IRLinker can link them.
Reviewers: tejohnson, mehdi_amini, dexonsmith
Reviewed By: dexonsmith
Subscribers: hans, llvm-commits
Differential Revision: https://reviews.llvm.org/D36556
------------------------------------------------------------------------
llvm-svn: 311481
------------------------------------------------------------------------
r311263 | ctopper | 2017-08-19 15:02:02 -0700 (Sat, 19 Aug 2017) | 1 line
[AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation.
------------------------------------------------------------------------
llvm-svn: 311478
------------------------------------------------------------------------
r311443 | arphaman | 2017-08-22 03:38:07 -0700 (Tue, 22 Aug 2017) | 15 lines
[ObjC] Check written attributes only when synthesizing ambiguous property
This commit fixes a bug introduced in r307903. The attribute ambiguity checker
that was introduced in r307903 checked all property attributes, which caused
errors for source-compatible properties, like:
@property (nonatomic, readonly) NSObject *prop;
@property (nonatomic, readwrite) NSObject *prop;
because the readwrite property would get implicit 'strong' attribute. The
ambiguity checker should be concerned about explicitly specified attributes
only.
rdar://33748089
------------------------------------------------------------------------
llvm-svn: 311464
------------------------------------------------------------------------
r311429 | ctopper | 2017-08-21 22:40:17 -0700 (Mon, 21 Aug 2017) | 9 lines
[X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type
ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.
This patch just adds a flag to prevent the APInt from changing width.
Fixes PR34271.
Differential Revision: https://reviews.llvm.org/D36996
------------------------------------------------------------------------
llvm-svn: 311462
------------------------------------------------------------------------
r311061 | compnerd | 2017-08-16 19:42:24 -0700 (Wed, 16 Aug 2017) | 10 lines
ARM: mark CPSR as clobbered for Windows VLAs
When lowering a VLA, we emit a __chstk call. However, this call can
internally clobber CPSR. We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`. In such a case, the CPSR could be clobbered, and the
check invalidated. When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case. Mark the register as clobbered to fix a possible
state corruption.
------------------------------------------------------------------------
llvm-svn: 311461
Add notes on this to the C language section, along with the C++ section.
Reviewers: bruno, hans
Differential Revision: https://reviews.llvm.org/D36954
llvm-svn: 311441
------------------------------------------------------------------------
r310983 | rsmith | 2017-08-15 18:49:53 -0700 (Tue, 15 Aug 2017) | 31 lines
PR19668, PR23034: Fix handling of move constructors and deleted copy
constructors when deciding whether classes should be passed indirectly.
This fixes ABI differences between Clang and GCC:
* Previously, Clang ignored the move constructor when making this
determination. It now takes the move constructor into account, per
https://github.com/itanium-cxx-abi/cxx-abi/pull/17 (this change may
seem recent, but the ABI change was agreed on the Itanium C++ ABI
list a long time ago).
* Previously, Clang's behavior when the copy constructor was deleted
was unstable -- depending on whether the lazy declaration of the
copy constructor had been triggered, you might get different behavior.
We now eagerly declare the copy constructor whenever its deletedness
is unclear, and ignore deleted copy/move constructors when looking for
a trivial such constructor.
This also fixes an ABI difference between Clang and MSVC:
* If the copy constructor would be implicitly deleted (but has not been
lazily declared yet), for instance because the class has an rvalue
reference member, we would pass it directly. We now pass such a class
indirectly, matching MSVC.
Based on a patch by Vassil Vassilev, which was based on a patch by Bernd
Schmidt, which was based on a patch by Reid Kleckner!
This is a re-commit of r310401, which was reverted in r310464 due to ARM
failures (which should now be fixed).
------------------------------------------------------------------------
llvm-svn: 311410
------------------------------------------------------------------------
r311071 | eladcohen | 2017-08-17 01:06:36 -0700 (Thu, 17 Aug 2017) | 13 lines
[SelectionDAG] Teach the vector-types operand scalarizer about SETCC
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illgeal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands does
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.
This patch attemps to teach the legalizer to handle these cases
by scalazring the operands, converting the node into a scalar
SETCC node.
Differential revision: https://reviews.llvm.org/D36651
------------------------------------------------------------------------
llvm-svn: 311409
------------------------------------------------------------------------
r310990 | mstorsjo | 2017-08-15 22:18:36 -0700 (Tue, 15 Aug 2017) | 18 lines
[llvm-dlltool] Fix creating stdcall/fastcall import libraries for i386
Hook up the -k option (that in the original GNU dlltool removes the
@n suffix from the symbol that the final executable ends up linked to).
In llvm-dlltool, make sure that functions end up with the undecorate
name type if this option is set and they are decorated. In mingw, when
creating import libraries from def files instead of creating an import
library as a side effect of linking a DLL, the symbol names in the def
contain the stdcall/fastcall decoration (but no leading underscore).
By setting the undecorate name type, a linker linking to the import
library will omit the decoration from the DLL import entry.
With this in place, mingw-w64 for i386 built with llvm-dlltool/clang
produces import libraries that actually work.
Differential Revision: https://reviews.llvm.org/D36548
------------------------------------------------------------------------
llvm-svn: 311408
------------------------------------------------------------------------
r311182 | alexshap | 2017-08-18 11:20:43 -0700 (Fri, 18 Aug 2017) | 22 lines
[analyzer] Fix modeling of constructors
This diff fixes analyzer's crash (triggered assert) on the newly added test case.
The assert being discussed is assert(!B.lookup(R, BindingKey::Direct))
in lib/StaticAnalyzer/Core/RegionStore.cpp, however the root cause is different.
For classes with empty bases the offsets might be tricky.
For example, let's assume we have
struct S: NonEmptyBase, EmptyBase {
...
};
In this case Clang applies empty base class optimization and
the offset of EmptyBase will be 0, it can be verified via
clang -cc1 -x c++ -v -fdump-record-layouts main.cpp -emit-llvm -o /dev/null.
When the analyzer tries to perform zero initialization of EmptyBase
it will hit the assert because that region
has already been "written" by the constructor of NonEmptyBase.
Test plan:
make check-all
Differential revision: https://reviews.llvm.org/D36851
------------------------------------------------------------------------
llvm-svn: 311378
------------------------------------------------------------------------
r311122 | mgorny | 2017-08-17 13:33:21 -0700 (Thu, 17 Aug 2017) | 20 lines
[cmake] Add explicit linkage from Core to curses
The Core library calls functions provided by the curses library. Add
an appropriate explicit LINK_LIBS to ${CURSES_LIBRARIES} to propagate
the dependency correctly within the build system.
It seems that so far the linkage was handled by some kind of implicit
magic LLDB_SYSTEM_LIBS variable. However, it stopped working for
unittests as the curses libraries are passed before the LLDBCore
library, resulting in `-Wl,--as-needed` stripping the yet-unused library
before it is required by LLDBCore, and effectively breaking the build.
I think it's better to focus on listing all the dependencies explicitly
and let CMake propagate them rather than trying to figure out why this
hack stopped working.
This is also more consistent with LLVM where the curses linkage
in LLVMSupport is expressed directly in the library rather than deferred
to the final programs.
Differential Revision: https://reviews.llvm.org/D36358
------------------------------------------------------------------------
------------------------------------------------------------------------
r311354 | mgorny | 2017-08-21 10:41:33 -0700 (Mon, 21 Aug 2017) | 19 lines
[cmake] Explicitly link dependency libraries in the Host library
Add explicit linkage to the necessary system libraries in the Host
library. Otherwise, the library fails to build with -Wl,--as-needed.
The system libraries ended up being listed on the linker command-line
before the static libraries needing them, resulting in --as-needed
stripping them.
Listing the dependent libraries explicitly is the canonical way of
declaring libraries in CMake. It guarantees that the system library
dependencies will be correctly propagated to reverse dependencies.
The code used to link libraries reuses existing EXTRA_LIBS variable,
copying code from other parts of LLDB. We might eventually remove
the direct use of system libraries in the programs; however, I would
prefer if we focused on fixing the build regressions in 5.0 branch
first, and went further after the release.
Differential Revision: https://reviews.llvm.org/D36885
------------------------------------------------------------------------
------------------------------------------------------------------------
r311355 | mgorny | 2017-08-21 10:41:39 -0700 (Mon, 21 Aug 2017) | 10 lines
[unittests] Build LLVMTestingSupport for out-of-source builds
The Process/gdb-remote test now requires the LLVMTestingSupport library
that is not installed by LLVM. As a result, when doing an out-of-source
build it fails being unable to find the library. To solve that, build
a local copy of the library when building LLDB with unittests and LLVM
sources available. This is based on how we deal with bundled gtest
sources.
Differential Revision: https://reviews.llvm.org/D36886
------------------------------------------------------------------------
llvm-svn: 311377
------------------------------------------------------------------------
r310712 | mgorny | 2017-08-11 06:25:20 -0700 (Fri, 11 Aug 2017) | 26 lines
[cmake] Expose the dependencies of ExecutionEngine as PUBLIC
Expose the dependencies of LLVMExecutionEngine library as PUBLIC rather
than PRIVATE when building a shared library. This is necessary because
the library is not contained but exposes API of other LLVM libraries via
its headers.
This causes other libraries to fail to link if the linker verifies for
correctness of -l flags (i.e. fails on indirect dependencies). This e.g.
happens when building LLDB against shared LLVM:
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTIN4llvm18MCJITMemoryManagerE[_ZTIN4llvm18MCJITMemoryManagerE]+0x10): undefined reference to `typeinfo for llvm::RuntimeDyld::MemoryManager'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN4llvm18MCJITMemoryManagerE[_ZTVN4llvm18MCJITMemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x48): undefined reference to `llvm::RTDyldMemoryManager::deregisterEHFrames()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0xd0): undefined reference to `llvm::JITSymbolResolver::anchor()'
collect2: error: ld returned 1 exit status
Declaring the dependencies as PUBLIC guarantees that any package using
the ExecutionEngine library will also get explicit -l flags for
the dependent libraries guaranteeing that the symbols exposed in headers
could be resolved.
Patch originally written by NAKAMURA Takumi.
Differential Revision: https://reviews.llvm.org/D36211
------------------------------------------------------------------------
llvm-svn: 311376
------------------------------------------------------------------------
r311229 | chandlerc | 2017-08-18 23:56:11 -0700 (Fri, 18 Aug 2017) | 19 lines
[Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.
We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.
This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.
The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.
Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]
------------------------------------------------------------------------
llvm-svn: 311372
------------------------------------------------------------------------
r311258 | mstorsjo | 2017-08-19 12:47:48 -0700 (Sat, 19 Aug 2017) | 9 lines
[ARM] Check the right order for halves of VZIP/VUZP if both parts are used
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.
This fixes PR33921.
Differential Revision: https://reviews.llvm.org/D36899
------------------------------------------------------------------------
llvm-svn: 311369
Summary: This patch should be applied to clang 5.0 release notes, NOT to trunk.
Reviewers: rengolin, hans
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D36902
llvm-svn: 311219
------------------------------------------------------------------------
r309474 | smeenai | 2017-07-28 19:54:41 -0700 (Fri, 28 Jul 2017) | 9 lines
[libc++] Hoist extern template above first use
This function template is referenced inside class basic_string as a
friend function. The extern template declaration needs to be above that
friend declaration to actually take effect.
This is important because this function was marked as exported in
r307966, so without the extern template taking effect, it can leak into
other DSOs as a visible symbol.
------------------------------------------------------------------------
llvm-svn: 311197
------------------------------------------------------------------------
r310262 | sdardis | 2017-08-07 08:37:57 -0700 (Mon, 07 Aug 2017) | 9 lines
[DebugInfo][DWARF] Correct some usages of PRIx32 to PRIx64
These lead to tests failing spuriously as the values after being rendered to a
string were incorrect.
Reviewers: clayborg
Differential Revision: https://reviews.llvm.org/D36319
------------------------------------------------------------------------
llvm-svn: 311196
------------------------------------------------------------------------
r311087 | sdardis | 2017-08-17 07:14:25 -0700 (Thu, 17 Aug 2017) | 15 lines
[dfsan] Add explicit zero extensions for shadow parameters in function wrappers.
In the case where dfsan provides a custom wrapper for a function,
shadow parameters are added for each parameter of the function.
These parameters are i16s. For targets which do not consider this
a legal type, the lack of sign extension information would cause
LLVM to generate anyexts around their usage with phi variables
and calling convention logic.
Address this by introducing zero exts for each shadow parameter.
Reviewers: pcc, slthakur
Differential Revision: https://reviews.llvm.org/D33349
------------------------------------------------------------------------
llvm-svn: 311195
------------------------------------------------------------------------
r310498 | guyblank | 2017-08-09 10:21:01 -0700 (Wed, 09 Aug 2017) | 9 lines
[X86][AVX512] Choose correct registers in vpbroadcastb/w
Fixes the vpbroadcastb/w instructions which use GPRs as source operands, to use the correct registers.
The full GPR should be used, and not the subregister, as it happens before the patch.
Fixes pr33795
Differential Revision:
https://reviews.llvm.org/D36479
------------------------------------------------------------------------
llvm-svn: 311110
------------------------------------------------------------------------
r309936 | chapuni | 2017-08-03 06:30:43 -0700 (Thu, 03 Aug 2017) | 1 line
ClangdTests: Try to unbreak the case CLANG_DEFAULT_CXX_STDLIB=libc++.
------------------------------------------------------------------------
llvm-svn: 311109
------------------------------------------------------------------------
r310979 | qcolombet | 2017-08-15 17:17:05 -0700 (Tue, 15 Aug 2017) | 38 lines
[VirtRegRewriter] Properly model the register liveness on undef subreg definition
Undef subreg definition means that the content of the super register
doesn't matter at this point. While that's true for virtual registers,
this may not hold when replacing them with actual physical registers.
Indeed, some part of the physical register may be coalesced with the
related virtual register and thus, the values for those parts matter and
must be live.
The fix consists in checking whether or not subregs of the physical register
being assigned to an undef subreg definition are live through that def and
insert an implicit use if they are. Doing so, will keep them alive until
that point like they should be.
E.g., let vreg14 being assigned to R0_R1 then
%vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
%vreg14:gsub_1<def> = COPY %R1
Before this changes, the rewriter would change the code into:
%R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1
; is believed to come
; from the previous
; instruction
Because of this invalid liveness, later pass could make wrong choices and in
particular clobber live register as it happened with the register scavenger in
llvm.org/PR34107
Now we would generate:
%R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to
; reach this point
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>
The bug has been here forever, it got exposed recently because the register
scavenger got smarter.
Fixes llvm.org/PR34107
------------------------------------------------------------------------
llvm-svn: 311108
------------------------------------------------------------------------
r310906 | marsupial | 2017-08-14 19:25:36 -0700 (Mon, 14 Aug 2017) | 12 lines
Propagate error in LazyEmittingLayer::removeModule.
Summary:
Besides being the better thing to do, not doing so will triggers an assert with LLVM_ENABLE_ABI_BREAKING_CHECKS.
Reviewers: lhames
Reviewed By: lhames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36700
------------------------------------------------------------------------
llvm-svn: 311107
------------------------------------------------------------------------
r310776 | rsmith | 2017-08-11 18:46:03 -0700 (Fri, 11 Aug 2017) | 8 lines
PR34163: Don't cache an incorrect key function for a class if queried between
the class becoming complete and its inline methods being parsed.
This replaces the hack of using the "late parsed template" flag to track member
functions with bodies we've not parsed yet; instead we now use the "will have
body" flag, which carries the desired implication that the function declaration
*is* a definition, and that we've just not parsed its body yet.
------------------------------------------------------------------------
llvm-svn: 311105
------------------------------------------------------------------------
r311068 | mstorsjo | 2017-08-16 22:58:27 -0700 (Wed, 16 Aug 2017) | 3 lines
[llvm-dlltool] Don't crash if no def file is provided or it can't be opened
Differential Revision: https://reviews.llvm.org/D36780
------------------------------------------------------------------------
llvm-svn: 311104
------------------------------------------------------------------------
r310992 | mstorsjo | 2017-08-15 22:23:00 -0700 (Tue, 15 Aug 2017) | 11 lines
[COFF] Don't produce weak aliases in import libraries
When creating an import library from lld, the cases with
Name != ExtName shouldn't end up as a weak alias, but as a real
export of the new name, which is what actually is exported from
the DLL.
This restores the behaviour of renamed exports to what it was in
4.0.
Differential Revision: https://reviews.llvm.org/D36634
------------------------------------------------------------------------
llvm-svn: 311101
------------------------------------------------------------------------
r310991 | mstorsjo | 2017-08-15 22:22:49 -0700 (Tue, 15 Aug 2017) | 13 lines
[COFF] Make the weak aliases optional
When creating an import library from lld, the cases with
Name != ExtName shouldn't end up as a weak alias, but as a real
export of the new name, which is what actually is exported from
the DLL.
This restores the behaviour of renamed exports to what it was in
4.0.
The other half of this commit, including test, goes into lld.
Differential Revision: https://reviews.llvm.org/D36633
------------------------------------------------------------------------
llvm-svn: 311100
------------------------------------------------------------------------
r310989 | mstorsjo | 2017-08-15 22:13:25 -0700 (Tue, 15 Aug 2017) | 18 lines
[COFF] Fix the name type for stdcall functions in import libraries
Since SVN r303491 and r304573, LLD used the COFFImportLibrary
functions from LLVM. These only had two names, Name and ExtName,
which wasn't enough to convey all the details of stdcall functions.
Stdcall functions got the wrong symbol name in the import library
itself in r303491, which is why it was reverted in r304561. When
re-landed and fixed in r304573 (after adding a test in r304572),
the symbol name itself in the import library ended up right, but the
name type of the import library entry was wrong.
This had the effect that linking to the import library succeeded
(contrary to in r303491, where linking to such an import library
failed), but at runtime, the symbol wouldn't be found in the DLL
(since the caller linked to the stdcall decorated name).
Differential Revision: https://reviews.llvm.org/D36545
------------------------------------------------------------------------
llvm-svn: 311098
------------------------------------------------------------------------
r310988 | mstorsjo | 2017-08-15 22:13:16 -0700 (Tue, 15 Aug 2017) | 8 lines
[COFF] Add SymbolName as a distinct field in COFFImportFile
The previous Name and ExtName aren't enough to convey all the nuances
between weak aliases and stdcall decorated function names.
A test for this will be added in LLD.
Differential Revision: https://reviews.llvm.org/D36544
------------------------------------------------------------------------
llvm-svn: 311096
------------------------------------------------------------------------
r310672 | ahatanak | 2017-08-10 17:06:49 -0700 (Thu, 10 Aug 2017) | 7 lines
[Sema][ObjC] Fix spurious -Wcast-qual warnings.
We do not meaningfully track object const-ness of Objective-C object
types. Silence the -Wcast-qual warning that is issued when casting to or
from Objective-C object types results in losing const qualification.
rdar://problem/33807915
------------------------------------------------------------------------
llvm-svn: 311095
------------------------------------------------------------------------
r310939 | tstellar | 2017-08-15 11:11:56 -0700 (Tue, 15 Aug 2017) | 16 lines
test-release.sh: Move test-suite setup to beginning of the script
Summary:
We want to catch failures early before do the full 3 stage build.
The goal here is to avoid running through the whole build process and have
it fail at the end (and not create the binary packages), just because
some prerequisites failed to install.
Reviewers: rovka, hans
Reviewed By: hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36422
------------------------------------------------------------------------
llvm-svn: 311094
------------------------------------------------------------------------
r310926 | steven_wu | 2017-08-15 09:16:33 -0700 (Tue, 15 Aug 2017) | 13 lines
[Doc] Update LangRef for new Module Flag Behavior
Summary:
Add the documentation for the new module flag behavior. The new
ModFlagBehavior is added in r303590.
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36557
------------------------------------------------------------------------
llvm-svn: 311092
------------------------------------------------------------------------
r310706 | arphaman | 2017-08-11 05:06:52 -0700 (Fri, 11 Aug 2017) | 11 lines
[modules] Set the lexical DC for dummy tag decls that refer to hidden
declarations that are made visible after the dummy is parsed and ODR verified
Prior to this commit the
"(getContainingDC(DC) == CurContext && "The next DeclContext should be lexically contained in the current one."),"
assertion failure was triggered during semantic analysis of the dummy
tag declaration that was declared in another tag declaration because its
lexical context did not point to the outer tag decl.
rdar://32292196
------------------------------------------------------------------------
------------------------------------------------------------------------
r310829 | arphaman | 2017-08-14 03:59:44 -0700 (Mon, 14 Aug 2017) | 5 lines
Set the lexical context for dummy tag decl inside createTagFromNewDecl
This is a follow-up to r310706. This change has been recommended by
Bruno Cardoso Lopes and Richard Smith.
------------------------------------------------------------------------
llvm-svn: 310902
------------------------------------------------------------------------
r310796 | asb | 2017-08-13 11:49:33 -0700 (Sun, 13 Aug 2017) | 16 lines
Remove RISCV from LLVM_ALL_TARGETS in CMakeLists.txt
It was mistakenly added to that list in D23560 (committed in rL285712). RISCV
is an experimental backend and should never have been in that list, I
mistakenly interpreted LLVM_ALL_TARGETS as a list of all targets rather than
targets to build by default. Unfortunately, because of this the RISCV backend
has been building by default when it shouldn't be.
This commet adds a description comment, which should help to avoid such
mistakes in the future.
See my message to llvm-dev for more information and analysis
<http://lists.llvm.org/pipermail/llvm-dev/2017-August/116347.html>.
Differential Revision: https://reviews.llvm.org/D36538
------------------------------------------------------------------------
llvm-svn: 310900
------------------------------------------------------------------------
r310784 | ctopper | 2017-08-12 13:19:44 -0700 (Sat, 12 Aug 2017) | 16 lines
[X86] When handling addcarry intrinsic, create the flag result with the correct type so we don't crash if we use a memory instruction
Summary:
Previously we were creating the flag result with MVT::Other which is interpretted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result.
Pretty sure we should be using MVT::i32 here, that's what we do other places we create these node types.
We should probably consider this for 5.0 as well.
Reviewers: RKSimon, zvi, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36645
------------------------------------------------------------------------
llvm-svn: 310899
------------------------------------------------------------------------
r309945 | spatel | 2017-08-03 08:07:37 -0700 (Thu, 03 Aug 2017) | 2 lines
[BDCE] add tests to show invalid/incomplete transforms
------------------------------------------------------------------------
------------------------------------------------------------------------
r310779 | spatel | 2017-08-12 09:41:08 -0700 (Sat, 12 Aug 2017) | 13 lines
[BDCE] clear poison generators after turning a value into zero (PR33695, PR34037)
nsw, nuw, and exact carry implicit assumptions about their operands, so we need
to clear those after trivializing a value. We decided there was no danger for
llvm.assume or metadata, so there's just a comment about that.
This fixes miscompiles as shown in:
https://bugs.llvm.org/show_bug.cgi?id=33695https://bugs.llvm.org/show_bug.cgi?id=34037
Differential Revision: https://reviews.llvm.org/D36592
------------------------------------------------------------------------
------------------------------------------------------------------------
r310842 | spatel | 2017-08-14 08:13:46 -0700 (Mon, 14 Aug 2017) | 16 lines
[BDCE] reduce scope of an assert (PR34179)
The assert was added with r310779 and is usually correct,
but as the test shows, not always. The 'volatile' on the
load is needed to expose the faulty path because without
it, DemandedBits would return that the load is just dead
rather than not demanded, and so we wouldn't hit the
bogus assert.
Also, since the lambda is just a single-line now, get rid
of it and inline the DB.isAllOnesValue() calls.
This should fix (prevent execution of a faulty assert):
https://bugs.llvm.org/show_bug.cgi?id=34179
------------------------------------------------------------------------
llvm-svn: 310898
------------------------------------------------------------------------
r310516 | hans | 2017-08-09 13:12:53 -0700 (Wed, 09 Aug 2017) | 13 lines
Make -std=c++17 an alias of -std=c++1z
As suggested on PR33912.
Trying to keep this small to make it easy to merge to the 5.0 branch. We
can do a follow-up with more thorough renaming (diagnostic text,
options, ids, etc.) later.
(For C++14 this was done in r215982, and I think a smaller patch for the
3.5 branch:
http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20140818/113013.html)
Differential Revision: https://reviews.llvm.org/D36532
------------------------------------------------------------------------
llvm-svn: 310848
------------------------------------------------------------------------
r309044 | mgorny | 2017-07-25 15:38:31 -0700 (Tue, 25 Jul 2017) | 14 lines
[lit] Fix UnboundLocalError for invalid shtest redirects
Replace the incorrect variable reference when invalid redirect is used.
This fixes the following issue:
File "/usr/src/llvm/utils/lit/lit/TestRunner.py", line 316, in processRedirects
raise InternalShellError(cmd, "Unsupported redirect: %r" % (r,))
UnboundLocalError: local variable 'r' referenced before assignment
which in turn broke shtest-shell.py and max-failures.py lit tests.
The breakage was introduced during refactoring in rL307310.
Differential Revision: https://reviews.llvm.org/D35857
------------------------------------------------------------------------
------------------------------------------------------------------------
r309071 | rnk | 2017-07-25 18:27:18 -0700 (Tue, 25 Jul 2017) | 1 line
[lit] Attempt to fix Python unittest adaptor logic
------------------------------------------------------------------------
------------------------------------------------------------------------
r309120 | modocache | 2017-07-26 07:59:36 -0700 (Wed, 26 Jul 2017) | 36 lines
Revert "[lit] Remove dead code not referenced in the LLVM SVN repo."
Summary:
This reverts rL306623, which removed `FileBasedTest`, an abstract base class,
but did not also remove the usages of that class in the lit unit tests.
The revert fixes four test failures in the lit unit test suite.
Test plan:
As per the instructions in `utils/lit/README.txt`, run the lit unit
test suite:
```
utils/lit/lit.py \
--path /path/to/your/llvm/build/bin \
utils/lit/tests
```
Verify that the following tests fail before applying this patch, and
pass once the patch is applied:
```
lit :: test-data.py
lit :: test-output.py
lit :: xunit-output.py
```
In addition, run `check-llvm` to make sure the existing LLVM test suite
executes normally.
Reviewers: george.karpenkov, mgorny, dlj
Reviewed By: mgorny
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35877
------------------------------------------------------------------------
------------------------------------------------------------------------
r309122 | modocache | 2017-07-26 08:02:05 -0700 (Wed, 26 Jul 2017) | 32 lines
[lit] Fix type error for parallelism groups
Summary:
Whereas rL299560 and rL309071 call `parallelism_groups.items()`, under the
assumption that `parallelism_groups` is a `dict` type, the default
parameter for that attribute is a `list`. Change the default to a
`dict` for type correctness.
This regression in the unit tests would have been caught if the
unit tests were being run continously. It also would have been caught
if the lit project used a Python type checker such as `mypy`.
Test Plan:
As per the instructions in `utils/lit/README.txt`, run the lit unit
test suite:
```
utils/lit/lit.py \
--path /path/to/your/llvm/build/bin \
utils/lit/tests
```
Verify that the test `lit :: unit/TestRunner.py` fails before applying this
patch, but passes once this patch is applied.
Reviewers: mgorny, rnk, rafael
Reviewed By: mgorny
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35878
------------------------------------------------------------------------
------------------------------------------------------------------------
r309140 | george.karpenkov | 2017-07-26 10:19:36 -0700 (Wed, 26 Jul 2017) | 3 lines
Fix LIT test breakage
Differential Revision: https://reviews.llvm.org/D35867
------------------------------------------------------------------------
------------------------------------------------------------------------
r309227 | rnk | 2017-07-26 15:57:32 -0700 (Wed, 26 Jul 2017) | 4 lines
[lit] Fix race between shtest-shell and max-failures tests
Previously these tests would use the same Output directory leading to
flaky non-deterministic failures.
------------------------------------------------------------------------
llvm-svn: 310730
------------------------------------------------------------------------
r310704 | smaksimovic | 2017-08-11 04:39:07 -0700 (Fri, 11 Aug 2017) | 8 lines
Revert r302670 for the upcoming 5.0.0 release
This is causing failures when compiling clang with -O3
as one of the structures used by clang is passed by
value and uses the fastcc calling convention.
Faliures manifest for stage2 mips build.
------------------------------------------------------------------------
llvm-svn: 310728
------------------------------------------------------------------------
r310700 | yamaguchi | 2017-08-11 02:44:42 -0700 (Fri, 11 Aug 2017) | 11 lines
[Bash-autocompletion] Add --autocomplete flag to 5.0 release notes
Summary:
I thought we should add this information to release notes, because we
added a new flag to clang driver.
Reviewers: v.g.vassilev, teemperor, ruiu
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36567
------------------------------------------------------------------------
llvm-svn: 310723
------------------------------------------------------------------------
r310691 | rsmith | 2017-08-10 19:04:19 -0700 (Thu, 10 Aug 2017) | 2 lines
PR33489: A function-style cast to a deduced class template specialization type is type-dependent if it can't be resolved due to a type-dependent argument.
------------------------------------------------------------------------
llvm-svn: 310719
------------------------------------------------------------------------
r310006 | ahatanak | 2017-08-03 16:55:42 -0700 (Thu, 03 Aug 2017) | 22 lines
[Driver][Darwin] Pass -munwind-table when !UseSjLjExceptions.
This commit fixes a bug where clang/llvm doesn't emit an unwind table
for a function when it is marked noexcept. Without this patch, the
following code terminates with an uncaught exception on ARM64:
int foo1() noexcept {
try {
throw 0;
} catch (int i) {
return 0;
}
return 1;
}
int main() {
return foo1();
}
rdar://problem/32411865
Differential Revision: https://reviews.llvm.org/D35693
------------------------------------------------------------------------
llvm-svn: 310677
------------------------------------------------------------------------
r309633 | ahatanak | 2017-07-31 15:19:34 -0700 (Mon, 31 Jul 2017) | 6 lines
[Driver] Make sure the deployment target is earlier than iOS 11 when
it is inferred from -isysroot.
This fixes a change that was inadvertently introduced in r309607.
rdar://problem/32230613
------------------------------------------------------------------------
------------------------------------------------------------------------
r309636 | ahatanak | 2017-07-31 15:46:00 -0700 (Mon, 31 Jul 2017) | 1 line
Silence warning -Wmissing-sysroot.
------------------------------------------------------------------------
------------------------------------------------------------------------
r309640 | ahatanak | 2017-07-31 16:08:52 -0700 (Mon, 31 Jul 2017) | 1 line
Use -target instead of -arch in test case.
------------------------------------------------------------------------
llvm-svn: 310676
------------------------------------------------------------------------
r309607 | ahatanak | 2017-07-31 12:16:40 -0700 (Mon, 31 Jul 2017) | 6 lines
[Driver] Allow users to silence the warning that is issued when the
deployment target is earlier than iOS 11 and the target is 32-bit.
This is a follow-up to r306922.
rdar://problem/32230613
------------------------------------------------------------------------
llvm-svn: 310675
------------------------------------------------------------------------
r309569 | alexfh | 2017-07-31 08:21:26 -0700 (Mon, 31 Jul 2017) | 39 lines
Fix -Wshadow false positives with function-local classes.
Summary:
Fixes http://llvm.org/PR33947.
https://godbolt.org/g/54XRMT
void f(int a) {
struct A {
void g(int a) {}
A() { int a; }
};
}
3 : <source>:3:16: warning: declaration shadows a local variable [-Wshadow]
void g(int a) {}
^
1 : <source>:1:12: note: previous declaration is here
void f(int a) {
^
4 : <source>:4:15: warning: declaration shadows a local variable [-Wshadow]
A() { int a; }
^
1 : <source>:1:12: note: previous declaration is here
void f(int a) {
^
2 warnings generated.
The local variable `a` of the function `f` can't be accessed from a method of
the function-local class A, thus no shadowing occurs and no diagnostic is
needed.
Reviewers: rnk, rsmith, arphaman, Quuxplusone
Reviewed By: rnk, Quuxplusone
Subscribers: Quuxplusone, cfe-commits
Differential Revision: https://reviews.llvm.org/D35941
------------------------------------------------------------------------
llvm-svn: 310674
------------------------------------------------------------------------
r310552 | eladcohen | 2017-08-10 00:44:23 -0700 (Thu, 10 Aug 2017) | 19 lines
[SelectionDAG] When scalarizing vselect, don't assert on
a legal cond operand.
When scalarizing the result of a vselect, the legalizer currently expects
to already have scalarized the operands. While this is true for the true/false
operands (which have the same type as the result), it is not case for the
condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such
as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is
illegal to hit an assertion during the scalarization.
The handling is similar to r205625.
This also exposes the fact that (v1i1 extract_subvector) should be legal
and selectable on AVX512 - We do this by custom lowering to vector_extract_elt.
This still leaves us in some cases with redundant dag nodes which will be
combined in a separate soon to come patch.
This fixes pr33349.
Differential revision: https://reviews.llvm.org/D36511
------------------------------------------------------------------------
llvm-svn: 310635
------------------------------------------------------------------------
r310481 | davide | 2017-08-09 08:13:50 -0700 (Wed, 09 Aug 2017) | 8 lines
[ValueTracking] Honour recursion limit.
The recently improved support for `icmp` in ValueTracking
(r307304) exposes the fact that `isImplied` condition doesn't
really bail out if we hit the recursion limit (and calls
`computeKnownBits` which increases the depth and asserts).
Differential Revision: https://reviews.llvm.org/D36512
------------------------------------------------------------------------
------------------------------------------------------------------------
r310492 | davide | 2017-08-09 09:06:04 -0700 (Wed, 09 Aug 2017) | 1 line
[ValueTracking] Update tests to unbreak the bots.
------------------------------------------------------------------------
------------------------------------------------------------------------
r310510 | spatel | 2017-08-09 11:56:26 -0700 (Wed, 09 Aug 2017) | 6 lines
[SimplifyCFG] remove checks for crasher test from r310481
Not sure why the earlier version would fail, but trying to get the bots
(and my local machine) to pass again.
------------------------------------------------------------------------
llvm-svn: 310634
------------------------------------------------------------------------
r308847 | mkazantsev | 2017-07-23 08:40:19 -0700 (Sun, 23 Jul 2017) | 24 lines
[SCEV] Limit max size of AddRecExpr during evolving
When SCEV calculates product of two SCEVAddRecs from the same loop, it
tries to combine them into one big AddRecExpr. If the sizes of the initial
SCEVs were `S1` and `S2`, the size of their product is `S1 + S2 - 1`, and every
operand of the resulting SCEV is combined from operands of initial SCEV and
has much higher complexity than they have.
As result, if we try to calculate something like:
%x1 = {a,+,b}
%x2 = mul i32 %x1, %x1
%x3 = mul i32 %x2, %x1
%x4 = mul i32 %x3, %x2
...
The size of such SCEVs grows as `2^N`, and the arguments
become more and more complex as we go forth. This leads
to long compilation and huge memory consumption.
This patch sets a limit after which we don't try to combine two
`SCEVAddRecExpr`s into one. By default, max allowed size of the
resulting AddRecExpr is set to 16.
Differential Revision: https://reviews.llvm.org/D35664
------------------------------------------------------------------------
llvm-svn: 310629
------------------------------------------------------------------------
r310534 | matze | 2017-08-09 15:22:05 -0700 (Wed, 09 Aug 2017) | 20 lines
ARM: Fix CMP_SWAP expansion
Clean up after my misguided attempt in r304267 to "fix" CMP_SWAP
returning an uninitialized status value.
- I was always using tMOVi8 to zero the status register which cannot
encode higher register numbers and llvm would silently miscompile)
- Nobody was ever looking at that status value outside the expansion.
ARMDAGToDAGISel::SelectCMP_SWAP() the only place creating CMP_SWAP
instructions was not mapping anything to it. (The cmpxchg status value
from llvm IR is lowered to a manual comparison after the CMP_SWAP)
So this:
- Renames the register from "status" to "temp" it make it obvious that
it isn't used outside the expansion.
- Remove the zeroing status/temp register.
- Keep the live-in list improvements from r304267
Fixes http://llvm.org/PR34056
------------------------------------------------------------------------
llvm-svn: 310628
------------------------------------------------------------------------
r309758 | sanjoy | 2017-08-01 15:37:58 -0700 (Tue, 01 Aug 2017) | 6 lines
[SCEV/IndVars] Always compute loop exiting values if the backedge count is 0
If SCEV can prove that the backedge taken count for a loop is zero, it does not
need to "understand" a recursive PHI to compute its exiting value.
This should fix PR33885.
------------------------------------------------------------------------
llvm-svn: 310538
------------------------------------------------------------------------
r310241 | vitalybuka | 2017-08-07 00:12:34 -0700 (Mon, 07 Aug 2017) | 3 lines
[asan] Disable checking of arguments passed by value for --asan-force-dynamic-shadow
Fails with "Instruction does not dominate all uses!"
------------------------------------------------------------------------
------------------------------------------------------------------------
r310242 | vitalybuka | 2017-08-07 00:35:33 -0700 (Mon, 07 Aug 2017) | 1 line
[asan] Fix asan dynamic shadow check before copyArgsPassedByValToAllocas
------------------------------------------------------------------------
llvm-svn: 310309
------------------------------------------------------------------------
r310158 | rtrieu | 2017-08-04 17:54:19 -0700 (Fri, 04 Aug 2017) | 8 lines
[ODRHash] Treat some non-templated classes as templated.
When using nested classes, if the inner class is not templated, but the outer
class is templated, the inner class will not be templated, but may have some
traits as if it were. This is particularly evident if the inner class
refers to the outer class in some fashion. Treat any class that is in the
context of a templated class as also a templated class.
------------------------------------------------------------------------
llvm-svn: 310302
------------------------------------------------------------------------
r310191 | ctopper | 2017-08-05 16:35:54 -0700 (Sat, 05 Aug 2017) | 18 lines
[X86] Enable isel to use the PAUSE instruction even when SSE2 is disabled. Clang part
Summary:
On older processors this instruction encoding is treated as a NOP.
MSVC doesn't disable intrinsics based on features the way clang/gcc does. Because the PAUSE instruction encoding doesn't crash older processors, some software out there uses these intrinsics without checking for SSE2.
This change also seems to also be consistent with gcc behavior.
Fixes PR34079
Reviewers: RKSimon, zvi
Reviewed By: RKSimon
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36362
------------------------------------------------------------------------
llvm-svn: 310294
------------------------------------------------------------------------
r310190 | ctopper | 2017-08-05 16:34:44 -0700 (Sat, 05 Aug 2017) | 18 lines
[X86] Enable isel to use the PAUSE instruction even when SSE2 is disabled
Summary:
On older processors this instruction encoding is treated as a NOP.
MSVC doesn't disable intrinsics based on features the way clang/gcc does. Because the PAUSE instruction encoding doesn't crash older processors, some software out there uses these intrinsics without checking for SSE2.
This change also seems to also be consistent with gcc behavior.
Fixes PR34079
Reviewers: RKSimon, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36361
------------------------------------------------------------------------
llvm-svn: 310293
------------------------------------------------------------------------
r310071 | rnk | 2017-08-04 10:09:11 -0700 (Fri, 04 Aug 2017) | 8 lines
[ArgPromotion] Preserve alignment of byval argument in new alloca
The frontend may have requested a higher alignment for any reason, and
downstream optimizations may already have taken advantage of it. We
should keep the same alignment when moving the allocation from the
parameter area to the local variable area.
Fixes PR34038
------------------------------------------------------------------------
llvm-svn: 310292
------------------------------------------------------------------------
r309838 | marshall | 2017-08-02 10:31:09 -0700 (Wed, 02 Aug 2017) | 1 line
Fix PR33727: std::basic_stringbuf only works with DefaultConstructible allocators. Thanks to Jonathan Wakely for the report and suggested fix
------------------------------------------------------------------------
llvm-svn: 310287
------------------------------------------------------------------------
r309296 | marshall | 2017-07-27 10:44:03 -0700 (Thu, 27 Jul 2017) | 1 line
Implement P0739R0: 'Some improvements to class template argument deduction integration into the standard library' This is an API change (not ABI change) due to a late change in the c++17 standard
------------------------------------------------------------------------
------------------------------------------------------------------------
r309307 | marshall | 2017-07-27 11:47:35 -0700 (Thu, 27 Jul 2017) | 1 line
Disable the deduction guide test I added in 309296 for the moment, while I figure out which compilers don't support deduction guides
------------------------------------------------------------------------
llvm-svn: 310286
------------------------------------------------------------------------
r310057 | smaksimovic | 2017-08-04 05:37:34 -0700 (Fri, 04 Aug 2017) | 8 lines
Revert r304953 for release 5.0.0
This is causing failures when compiling clang with -O3
as one of the structures used by clang is passed by
value and uses the fastcc calling convention.
Faliures manifest for stage2 mips build.
------------------------------------------------------------------------
llvm-svn: 310074
------------------------------------------------------------------------
r309920 | ericwf | 2017-08-02 21:28:10 -0700 (Wed, 02 Aug 2017) | 5 lines
Fix libcxx build with glibc 2.26+ by removing xlocale.h include.
Patch by Khem Raj. Reviewed as D35697. Also see PR33729.
------------------------------------------------------------------------
llvm-svn: 310068
------------------------------------------------------------------------
r309975 | rsmith | 2017-08-03 12:24:27 -0700 (Thu, 03 Aug 2017) | 4 lines
Don't emit undefined-internal warnings for CXXDeductionGuideDecls.
Patch by ~paul (cynecx on phabricator)! Some test massaging by me.
------------------------------------------------------------------------
llvm-svn: 310067
------------------------------------------------------------------------
r309917 | ericwf | 2017-08-02 19:50:43 -0700 (Wed, 02 Aug 2017) | 4 lines
Add system header pragma to BSD locale fallback headers.
This prevent leaking warnings to the user about use of C++11
extensions in C++03.
------------------------------------------------------------------------
llvm-svn: 309958
------------------------------------------------------------------------
r309651 | inouehrs | 2017-07-31 20:32:15 -0700 (Mon, 31 Jul 2017) | 16 lines
[StackColoring] Update AliasAnalysis information in stack coloring pass
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp
// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.
But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.
This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.
------------------------------------------------------------------------
------------------------------------------------------------------------
r309849 | inouehrs | 2017-08-02 11:16:32 -0700 (Wed, 02 Aug 2017) | 19 lines
[StackColoring] Update AliasAnalysis information in stack coloring pass (part 2)
This patch is update after the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments.
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp
// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.
But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.
This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.
This patch fixes PR33928.
------------------------------------------------------------------------
llvm-svn: 309957
------------------------------------------------------------------------
r309930 | sdardis | 2017-08-03 02:38:46 -0700 (Thu, 03 Aug 2017) | 19 lines
[SelectionDAG] Resolve PR33978.
rL306209 taught SelectionDAG how to add the dereferenceable flag when
expanding memcpy and memmove. The fix however contained a nit where
the offset + size was constructed as an APInt of PointerSize rather
than PointerSizeInBits.
This lead to isDereferenceableAndAlignedPointer() get truncated values or
values which would be sign extended within that function leading to
incorrect results.
Thanks to Alex Crichton for reporting the issue!
This resolves PR33978.
Reviewers: inouehrs
Differential Revision: https://reviews.llvm.org/D36236
------------------------------------------------------------------------
llvm-svn: 309956
------------------------------------------------------------------------
r309523 | brad | 2017-07-30 14:13:59 -0700 (Sun, 30 Jul 2017) | 2 lines
Also pass -pie back to the linker when linking on OpenBSD.
------------------------------------------------------------------------
llvm-svn: 309844
------------------------------------------------------------------------
r309744 | mstorsjo | 2017-08-01 14:13:54 -0700 (Tue, 01 Aug 2017) | 29 lines
[AArch64] Rewrite stack frame handling for win64 vararg functions
The previous attempt, which made do with a single offset in
computeCalleeSaveRegisterPairs, wasn't quite enough. The previous
attempt only worked as long as CombineSPBump == true (since the
offset would be adjusted later in fixupCalleeSaveRestoreStackOffset).
Instead include the size for the fixed stack area used for win64
varargs in calculations in emitPrologue/emitEpilogue. The stack
consists of mainly three parts;
- AFI->getLocalStackSize()
- AFI->getCalleeSavedStackSize()
- FixedObject
Most of the places in the code which previously used the CSStackSize
now use PrologueSaveSize instead, which is the sum of the latter
two, while some cases which need exactly the middle one use
AFI->getCalleeSavedStackSize() explicitly instead of a local variable.
In addition to moving the offsetting into emitPrologue/emitEpilogue
(which fixes functions with CombineSPBump == false), also set the
frame pointer to point to the right location, where the frame pointer
and link register actually are stored. In addition to the prologue/epilogue,
this also requires changes to resolveFrameIndexReference.
Add tests for a function that keeps a frame pointer and another one
that uses a VLA.
Differential Revision: https://reviews.llvm.org/D35919
------------------------------------------------------------------------
llvm-svn: 309843
------------------------------------------------------------------------
r309555 | mstorsjo | 2017-07-31 04:18:41 -0700 (Mon, 31 Jul 2017) | 10 lines
[llvm-dlltool] Write correct weak externals
Previously, the created object files for the import library were broken.
Write the symbol table before the string table. Simplify the code by
using a separate variable Prefix instead of duplicating a few lines.
Also update the coff-weak-exports to actually check that the generated
weak symbols can be found as intended.
Differential Revision: https://reviews.llvm.org/D36065
------------------------------------------------------------------------
llvm-svn: 309837
------------------------------------------------------------------------
r309594 | majnemer | 2017-07-31 10:47:07 -0700 (Mon, 31 Jul 2017) | 4 lines
[IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializer
We are not allowed to reason about an initializer value without first
consulting hasDefinitiveInitializer.
------------------------------------------------------------------------
llvm-svn: 309827
------------------------------------------------------------------------
r309722 | bruno | 2017-08-01 12:05:25 -0700 (Tue, 01 Aug 2017) | 7 lines
[Sema] Fix lax conversion between non ext vectors
r282968 introduced a regression due to the lack of proper testing.
Re-add lax conversion support between non ext vectors for compound
assignments and add a test for that.
rdar://problem/28639467
------------------------------------------------------------------------
llvm-svn: 309770
------------------------------------------------------------------------
r309561 | sdardis | 2017-07-31 07:06:58 -0700 (Mon, 31 Jul 2017) | 14 lines
[SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.
This resolves PR33883.
Thanks to Alex Crichton for reporting the issue!
Reviewers: zoran.jovanovic, atanasyan
Differential Revision: https://reviews.llvm.org/D35765
------------------------------------------------------------------------
llvm-svn: 309767
------------------------------------------------------------------------
r309495 | fhahn | 2017-07-29 13:35:28 -0700 (Sat, 29 Jul 2017) | 30 lines
[AArch64] Tie source and destination operands for AESMC/AESIMC.
Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
AESE Vn, _
AESMC Vn, Vn
and
AESD Vn, _
AESIMC Vn, Vn
The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constraint source and destination registers to be the same.
A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.
I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.
Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga
Reviewed By: t.p.northover, evandro
Subscribers: aemerson, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D35299
------------------------------------------------------------------------
llvm-svn: 309765
------------------------------------------------------------------------
r309752 | bruno | 2017-08-01 15:10:36 -0700 (Tue, 01 Aug 2017) | 6 lines
[Headers][Darwin] Allow #include_next<float.h> to work on Darwin prior to 10.7
This fixes PR31504 and it's a follow up from adding #include_next<float.h>
for Darwin in r289018.
rdar://problem/29856682
------------------------------------------------------------------------
llvm-svn: 309764
------------------------------------------------------------------------
r309330 | davide | 2017-07-27 15:20:44 -0700 (Thu, 27 Jul 2017) | 13 lines
[ConstantFolder] Don't try to fold gep when the idx is a vector.
The code in ConstantFoldGetElementPtr() assumes integers, and
therefore it crashes trying to get the integer bidwith of a vector
type (in this case <4 x i32>. I just changed the code to prevent
the folding in case of vectors and I didn't bother to generalize
as this doesn't seem to me something that really happens in
practice, but I'm willing to change the patch if you think
it's worth it.
This is hard to trigger from -instsimplify or -instcombine
only as the second instruction is dead, so the test uses loop-unroll.
Differential Revision: https://reviews.llvm.org/D35956
------------------------------------------------------------------------
llvm-svn: 309595
------------------------------------------------------------------------
r309481 | mgorny | 2017-07-28 23:46:45 -0700 (Fri, 28 Jul 2017) | 13 lines
[OCaml] Install dynamic libraries in 'stubdirs' directory
Install the OCaml dynamic libraries in the 'stubdirs' directory rather
than the llvm subdirectory in order to fix running executables created
by ocamlc. Otherwise, the executables fail to run being unable to locate
the libraries (unless the LLVM directory is explicitly added to
LD_LIBRARY_PATH).
The staging directories are not altered since they work for our
development setup anyway, and installing into two directories would
unnecessarily make the code more complex.
Differential Revision: https://reviews.llvm.org/D35995
------------------------------------------------------------------------
llvm-svn: 309592
------------------------------------------------------------------------
r309483 | mgorny | 2017-07-29 01:10:24 -0700 (Sat, 29 Jul 2017) | 7 lines
[OCaml] Pass -D/-UNDEBUG through to ocamlc
Detect [/-][DU]NDEBUG in CMAKE_C_FLAGS* and pass them through to ocamlc.
This is necessary because their value might affect visibility of dump
functions in LLVM and ocamlc uses its own compiler and flags by default.
Differential Revision: https://reviews.llvm.org/D35898
------------------------------------------------------------------------
llvm-svn: 309591
------------------------------------------------------------------------
r309321 | mgorny | 2017-07-27 14:13:25 -0700 (Thu, 27 Jul 2017) | 12 lines
[OCaml] Fix undefined reference to LLVMDumpType() with NDEBUG
Account for the possibility of LLVMDumpType() not being available with
NDEBUG in the OCaml bindings. If it is not built into LLVM, make
the dump function raise an exception.
Since rL293359, the dump functions are built only if either NDEBUG is
not defined, or LLVM_ENABLE_DUMP is defined. As a result, if the dump
functions are not built in LLVM, the dynamic OCaml libraries fail to
load due to undefined LLVMDumpType symbol.
Differential Revision: https://reviews.llvm.org/D35899
------------------------------------------------------------------------
llvm-svn: 309590
------------------------------------------------------------------------
r309382 | rksimon | 2017-07-28 06:47:02 -0700 (Fri, 28 Jul 2017) | 3 lines
[X86] Add tests showing inability of vector non-temporal load/store intrinsic to force pointer alignment (PR33830)
Clang specifies a max type alignment of 16 bytes on darwin targets, meaning that the builtin nontemporal stores don't correctly align the loads/stores to 32 or 64 bytes when required, resulting in lowering to temporal unaligned loads/stores.
------------------------------------------------------------------------
Merging r309383:
------------------------------------------------------------------------
r309383 | rksimon | 2017-07-28 07:01:51 -0700 (Fri, 28 Jul 2017) | 1 line
Strip trailing whitespace. NFCI.
------------------------------------------------------------------------
Merging r309488:
------------------------------------------------------------------------
r309488 | rksimon | 2017-07-29 08:33:34 -0700 (Sat, 29 Jul 2017) | 7 lines
[X86][AVX] Ensure vector non-temporal load/store intrinsics force pointer alignment (PR33830)
Clang specifies a max type alignment of 16 bytes on darwin targets (annoyingly in the driver not via cc1), meaning that the builtin nontemporal stores don't correctly align the loads/stores to 32 or 64 bytes when required, resulting in lowering to temporal unaligned loads/stores.
This patch casts the vectors to explicitly aligned types prior to the load/store to ensure that the require alignment is respected.
Differential Revision: https://reviews.llvm.org/D35996
------------------------------------------------------------------------
llvm-svn: 309588
------------------------------------------------------------------------
r309325 | ab | 2017-07-27 14:28:59 -0700 (Thu, 27 Jul 2017) | 8 lines
[X86] Don't lie about legality to TLI's demanded bits.
Like r309323, X86 had a typo where it passed the wrong flags to TLO.
Found by inspection; I haven't been able to tickle this into having
observable behavior. I don't think it does, given that X86 doesn't have
custom demanded bits logic, and the generic logic doesn't have a lot of
exposure to illegal constructs.
------------------------------------------------------------------------
llvm-svn: 309587
------------------------------------------------------------------------
r309323 | ab | 2017-07-27 14:27:25 -0700 (Thu, 27 Jul 2017) | 12 lines
[AArch64] Fix legality info passed to demanded bits for TBI opt.
The (seldom-used) TBI-aware optimization had a typo lying dormant since
it was first introduced, in r252573: when asking for demanded bits, it
told TLI that it was running after legalize, where the opposite was
true.
This is an important piece of information, that the demanded bits
analysis uses to make assumptions about the node. r301019 added such an
assumption, which was broken by the TBI combine.
Instead, pass the correct flags to TLO.
------------------------------------------------------------------------
llvm-svn: 309586
------------------------------------------------------------------------
r309503 | rsmith | 2017-07-29 23:31:29 -0700 (Sat, 29 Jul 2017) | 6 lines
PR33902: Invalidate line number cache when adding more text to existing buffer.
This led to crashes as the line number cache would report a bogus line number
for a line of code, and we'd try to find a nonexistent column within the line
when printing diagnostics.
------------------------------------------------------------------------
llvm-svn: 309580
------------------------------------------------------------------------
r309343 | rnk | 2017-07-27 17:58:35 -0700 (Thu, 27 Jul 2017) | 16 lines
[X86] Fix latent bug in sibcall eligibility logic
The X86 tail call eligibility logic was correct when it was written, but
the addition of inalloca and argument copy elision broke its
assumptions. It was assuming that fixed stack objects were immutable.
Currently, we aim to emit a tail call if no arguments have to be
re-arranged in memory. This code would trace the outgoing argument
values back to check if they are loads from an incoming stack object.
If the stack argument is immutable, then we won't need to store it back
to the stack when we tail call.
Fortunately, stack objects track their mutability, so we can just make
the obvious check to fix the bug.
This was http://crbug.com/749826
------------------------------------------------------------------------
llvm-svn: 309577
------------------------------------------------------------------------
r309422 | rnk | 2017-07-28 12:48:40 -0700 (Fri, 28 Jul 2017) | 25 lines
Fix conditional tail call branch folding when both edges are the same
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:
BB#1: derived from LLVM BB %entry
Live Ins: %EFLAGS
Predecessors according to CFG: BB#0
JE_1 <BB#5>, %EFLAGS<imp-use,kill>
JMP_1 <BB#5>
BB#5: derived from LLVM BB %sw.epilog
Predecessors according to CFG: BB#1
TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...
We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.
This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.
Fixes PR33980
------------------------------------------------------------------------
llvm-svn: 309574
------------------------------------------------------------------------
r309353 | davide | 2017-07-27 19:57:43 -0700 (Thu, 27 Jul 2017) | 3 lines
[JumpThreading] Add an option to dump LazyValueInfo after the run.
Differential Revision: https://reviews.llvm.org/D35973
------------------------------------------------------------------------
------------------------------------------------------------------------
r309355 | davide | 2017-07-27 20:10:43 -0700 (Thu, 27 Jul 2017) | 24 lines
[JumpThreading] Stop falsely preserving LazyValueInfo.
JumpThreading claims to preserve LVI, but it doesn't preserve
the analyses which LVI holds a reference to (e.g. the Dominator).
In the current pass manager infrastructure, after JT runs, the
PM frees these analyses (including DominatorTree) but preserves
LVI.
CorrelatedValuePropagation runs immediately after and queries
a corrupted domtree, causing weird miscompiles.
This commit disables the preservation of LVI for the time being.
Eventually, we should either move LVI to a proper dependency
tracking mechanism (i.e. an analyses shouldn't hold references
to other analyses and compute them on demand if needed), or
we should teach all the passes preserving LVI to preserve the
analyses LVI depends on.
The new pass manager has a mechanism to invalidate LVI in case
one of the analyses it depends on becomes invalid, so this problem
shouldn't exist (at least not in this immediate form), but handling
of analyses holding references is still a very delicate subject.
Fixes PR33917 (and rustc).
------------------------------------------------------------------------
llvm-svn: 309439
------------------------------------------------------------------------
r309113 | yamaguchi | 2017-07-26 06:36:58 -0700 (Wed, 26 Jul 2017) | 19 lines
[Bash-autocompletion] Show HelpText with possible flags
Summary:
`clang --autocomplete=-std` will show
```
-std: Language standard to compile for
-std= Language standard to compile for
-stdlib= C++ standard library to use
```
after this change.
However, showing HelpText with completion in bash seems super tricky, so
this feature will be used in other shells (fish, zsh...).
Reviewers: v.g.vassilev, teemperor, ruiu
Subscribers: cfe-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D35759
------------------------------------------------------------------------
llvm-svn: 309438
------------------------------------------------------------------------
r309113 | yamaguchi | 2017-07-26 06:36:58 -0700 (Wed, 26 Jul 2017) | 19 lines
[Bash-autocompletion] Show HelpText with possible flags
Summary:
`clang --autocomplete=-std` will show
```
-std: Language standard to compile for
-std= Language standard to compile for
-stdlib= C++ standard library to use
```
after this change.
However, showing HelpText with completion in bash seems super tricky, so
this feature will be used in other shells (fish, zsh...).
Reviewers: v.g.vassilev, teemperor, ruiu
Subscribers: cfe-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D35759
------------------------------------------------------------------------
llvm-svn: 309437
------------------------------------------------------------------------
r309112 | yamaguchi | 2017-07-26 06:30:36 -0700 (Wed, 26 Jul 2017) | 7 lines
[Bash-completion] Fixed a bug that file doesn't autocompleted after =
Summary:
File path wasn't autocompleted after `-fmodule-cache-path=[tab]`, so
fixed this bug by checking if $flags contains only a newline or not.
Differential Revision: https://reviews.llvm.org/D35763
------------------------------------------------------------------------
llvm-svn: 309435
------------------------------------------------------------------------
r309302 | rksimon | 2017-07-27 11:15:54 -0700 (Thu, 27 Jul 2017) | 3 lines
[SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)
Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them.
------------------------------------------------------------------------
llvm-svn: 309348
------------------------------------------------------------------------
r309327 | compnerd | 2017-07-27 14:56:25 -0700 (Thu, 27 Jul 2017) | 5 lines
Headers: fix _Unwind_{G,S}etGR for non-EHABI targets
The EHABI definition was being inlined into the users even when EHABI
was not in use. Adjust the condition to ensure that the right version
is defined.
------------------------------------------------------------------------
llvm-svn: 309328
------------------------------------------------------------------------
r309226 | compnerd | 2017-07-26 15:55:23 -0700 (Wed, 26 Jul 2017) | 13 lines
Headers: improve ARM EHABI coverage of unwind.h
Ensure that we define the `_Unwind_Control_Block` structure used on ARM
EHABI targets. This is needed for building libc++abi with the unwind.h
from the resource dir. A minor fallout of this is that we needed to
create a typedef for _Unwind_Exception to work across ARM EHABI and
non-EHABI targets. The structure definitions here are based originally
on the documentation from ARM under the "Exception Handling ABI for the
ARM® Architecture" Section 7.2. They are then adjusted to more closely
reflect the definition in libunwind from LLVM. Those changes are
compatible in layout but permit easier use in libc++abi and help
maintain compatibility between libunwind and the compiler provided
definition.
------------------------------------------------------------------------
llvm-svn: 309290
------------------------------------------------------------------------
r308808 | arsenm | 2017-07-21 16:56:13 -0700 (Fri, 21 Jul 2017) | 6 lines
RA: Remove assert on empty live intervals
This is possible if there is an undef use when
splitting the vreg during spilling.
Fixes bug 33620.
------------------------------------------------------------------------
------------------------------------------------------------------------
r308813 | arsenm | 2017-07-21 17:24:01 -0700 (Fri, 21 Jul 2017) | 6 lines
RA: Remove another assert on empty intervals
This case is similar to the one fixed in r308808,
except when rematerializing.
Fixes bug 33884.
------------------------------------------------------------------------
------------------------------------------------------------------------
r308906 | arsenm | 2017-07-24 11:07:55 -0700 (Mon, 24 Jul 2017) | 6 lines
RA: Replace asserts related to empty live intervals
These don't exactly assert the same thing anymore, and
allow empty live intervals with non-empty uses.
Removed in r308808 and r308813.
------------------------------------------------------------------------
llvm-svn: 309171
------------------------------------------------------------------------
r308903 | arsenm | 2017-07-24 11:06:15 -0700 (Mon, 24 Jul 2017) | 5 lines
AMDGPU: Fix allocating pseudo-registers
There's no need for these to be part of a class since
they are immediately replaced. New unreachable hit in
existing tests.'
------------------------------------------------------------------------
llvm-svn: 309157
------------------------------------------------------------------------
r308871 | chill | 2017-07-24 02:19:32 -0700 (Mon, 24 Jul 2017) | 17 lines
[libunwind] Handle .ARM.exidx tables without sentinel last entry
UnwindCursor<A, R>::getInfoFromEHABISection assumes the last
entry in the index table never corresponds to a real function.
Indeed, GNU ld always inserts an EXIDX_CANTUNWIND entry,
containing the end of the .text section. However, the EHABI specification
(http://infocenter.arm.com/help/topic/com.arm.doc.ihi0038b/IHI0038B_ehabi.pdf)
does not seem to contain text that requires the presence of a sentinel entry.
In that sense the libunwind implementation isn't compliant with the specification.
This patch makes getInfoFromEHABISection examine the last entry in the index
table if upper_bound returns the end iterator.
Fixes https://bugs.llvm.org/show_bug.cgi?id=31091
Differential revision: https://reviews.llvm.org/D35265
------------------------------------------------------------------------
llvm-svn: 309143
------------------------------------------------------------------------
r309058 | majnemer | 2017-07-25 16:33:58 -0700 (Tue, 25 Jul 2017) | 9 lines
[CodeGen] Correctly model std::byte's aliasing properties
std::byte, when defined as an enum, needs to be given special treatment
with regards to its aliasing properties. An array of std::byte is
allowed to be used as storage for other types.
This fixes PR33916.
Differential Revision: https://reviews.llvm.org/D35824
------------------------------------------------------------------------
llvm-svn: 309135
------------------------------------------------------------------------
r308950 | mstorsjo | 2017-07-24 22:20:01 -0700 (Mon, 24 Jul 2017) | 22 lines
[AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs
Create a dummy 8 byte fixed object for the unused slot below the first
stored vararg.
Alternative ideas tested but skipped: One could try to align the whole
fixed object to 16, but I haven't found how to add an offset to the stack
frame used in LowerWin64_VASTART.
If only the size of the fixed stack object size is padded but not the offset, via
MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false),
PrologEpilogInserter crashes due to "Attempted to reset backwards range!".
This fixes misconceptions about where registers are spilled, since
AArch64FrameLowering.cpp assumes the offset from fixed objects is
aligned to 16 bytes (and the Win64 case there already manually aligns
the offset to 16 bytes).
This fixes cases where local stack allocations could overwrite callee
saved registers on the stack.
Differential Revision: https://reviews.llvm.org/D35720
------------------------------------------------------------------------
llvm-svn: 309132
------------------------------------------------------------------------
r308891 | d0k | 2017-07-24 09:18:09 -0700 (Mon, 24 Jul 2017) | 16 lines
[CodeGenPrepare] Cut off FindAllMemoryUses if there are too many uses.
This avoids excessive compile time. The case I'm looking at is
Function.cpp from an old version of LLVM that still had the giant memcmp
string matcher in it. Before r308322 this compiled in about 2 minutes,
after it, clang takes infinite* time to compile it. With this patch
we're at 5 min, which is still bad but this is a pathological case.
The cut off at 20 uses was chosen by looking at other cut-offs in LLVM
for user scanning. It's probably too high, but does the job and is very
unlikely to regress anything.
Fixes PR33900.
* I'm impatient and aborted after 15 minutes, on the bug report it was
killed after 2h.
------------------------------------------------------------------------
llvm-svn: 309131
------------------------------------------------------------------------
r308998 | nico | 2017-07-25 11:08:03 -0700 (Tue, 25 Jul 2017) | 7 lines
lld: only write .manifest files if /manifest is passed, PR33925
Also emit an error if /manifestinput: is used without /manifest:embed.
Increases compatibility with link.exe
https://reviews.llvm.org/D35842
------------------------------------------------------------------------
------------------------------------------------------------------------
r309002 | nico | 2017-07-25 11:39:38 -0700 (Tue, 25 Jul 2017) | 6 lines
Attempt to fix lld tests on Windows after 308998.
The test used /manifestinput: without /manifest:embed, which isn't actually
supported. Just remove this part of the test for now; if it's important to
check this the llvm-readobj part should be extended to check this.
------------------------------------------------------------------------
llvm-svn: 309128
------------------------------------------------------------------------
r308963 | rksimon | 2017-07-25 03:33:36 -0700 (Tue, 25 Jul 2017) | 1 line
[X86] Add 24-byte memcmp tests (PR33914)
------------------------------------------------------------------------
------------------------------------------------------------------------
r308986 | rksimon | 2017-07-25 10:04:37 -0700 (Tue, 25 Jul 2017) | 9 lines
[X86][CGP] Reduce memcmp() expansion to 2 load pairs (PR33914)
D35067/rL308322 attempted to support up to 4 load pairs for memcmp inlining which resulted in regressions for some optimized libc memcmp implementations (PR33914).
Until we can match these more optimal cases, this patch reduces the memcmp expansion to a maximum of 2 load pairs (which matches what we do for -Os).
This patch should be considered for the 5.0.0 release branch as well
Differential Revision: https://reviews.llvm.org/D35830
------------------------------------------------------------------------
llvm-svn: 309127
------------------------------------------------------------------------
r309115 | hahnfeld | 2017-07-26 06:55:00 -0700 (Wed, 26 Jul 2017) | 15 lines
[CMake] Disable building libomptarget and add CMake switch
Introduce OPENMP_ENABLE_LIBOMPTARGET which defaults to OFF at the moment.
libomptarget is not yet ready for prime time:
- Offloading to NVIDIA GPUs is not completed yet (compiler, device RTL)
- The generic ELF plugin for offloading to the host (meant for testing)
uses a single instance of the OpenMP runtime (libomp). That is why
omp_is_initial_device() returns 1 which makes the tests fail.
Because of these reasons, we want to disable building (and testing!)
for release 5.0.
See https://bugs.llvm.org/show_bug.cgi?id=33859
Differential Revision: https://reviews.llvm.org/D35719
------------------------------------------------------------------------
llvm-svn: 309126
------------------------------------------------------------------------
r308912 | tstellar | 2017-07-24 15:28:30 -0400 (Mon, 24 Jul 2017) | 14 lines
test-release.sh: Fix phase2 and phase3 binary comparision
Summary:
scudo_utils.cpp.o from compiler-rt has one of the host compiler's builtin
include paths stored in the .debug_line section. So we need to do
sed 's,Phase1,Phase2,g` on the Phase2 object file so it matches Phase3.
Reviewers: hans
Reviewed By: hans
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34989
------------------------------------------------------------------------
llvm-svn: 309003
------------------------------------------------------------------------
r308897 | nico | 2017-07-24 09:54:11 -0700 (Mon, 24 Jul 2017) | 9 lines
Work around an MSVC2017 update 3 codegen bug.
C2017 update 3 produces a clang that crashes when compiling clang. Disabling
optimizations for StmtProfiler::VisitCXXOperatorCallExpr() makes the crash go
away.
Patch from Bruce Dawson <brucedawson@chromium.org>!
https://reviews.llvm.org/D35757
------------------------------------------------------------------------
llvm-svn: 308988
------------------------------------------------------------------------
r308492 | rafael | 2017-07-19 09:45:05 -0700 (Wed, 19 Jul 2017) | 20 lines
Bring back r307364.
In addition this includes a change to prefer symbols with a default
version @@ over unversioned symbols.
Original commit message:
[ELF] - Handle symbols with default version early.
This fixes last testcase provided in PR28414.
In short issue is next: when we had X@@Version symbol in object A,
we did not resolve it to X early. Then when in another object B
we had reference to undefined X, symbol X from archive was fetched.
Since both archive and object A contains another symbol Z, duplicate
symbol definition was triggered as a result.
Correct behavior is to use X@@Version from object A instead and do not fetch
any symbols from archive.
Differential revision: https://reviews.llvm.org/D35059
------------------------------------------------------------------------
llvm-svn: 308788
------------------------------------------------------------------------
r308503 | davide | 2017-07-19 11:09:46 -0700 (Wed, 19 Jul 2017) | 3 lines
[X86] Don't try to scale down if that exceeds the bitwidth.
Fixes the crash reported in PR33844.
------------------------------------------------------------------------
llvm-svn: 308718
------------------------------------------------------------------------
r308512 | pfaffe | 2017-07-19 12:20:58 -0700 (Wed, 19 Jul 2017) | 4 lines
[CMake] Fix r307650: Readd missing dependency.
The commit erroneously removed the dependency of the Polly tests on
things like opt and FileCheck. Add that dependency back.
------------------------------------------------------------------------
llvm-svn: 308717
------------------------------------------------------------------------
r308483 | hans | 2017-07-19 08:03:38 -0700 (Wed, 19 Jul 2017) | 12 lines
Defeat a GCC -Wunused-result warning
It was warning like:
../llvm-project/llvm/lib/Support/ErrorHandling.cpp:172:51: warning:
ignoring return value of ‘ssize_t write(int, const void*, size_t)’,
declared with attribute warn_unused_result [-Wunused-result]
(void)::write(2, OOMMessage, strlen(OOMMessage));
Work around the warning by storing the return value in a variable and
casting that to void instead. We already did this for the other write()
call in this file.
------------------------------------------------------------------------
llvm-svn: 308487
------------------------------------------------------------------------
r308455 | hans | 2017-07-19 05:31:01 -0700 (Wed, 19 Jul 2017) | 16 lines
Revert r308441 "Recommit r308327: Add a warning for missing '#pragma pack (pop)' and suspicious uses of '#pragma pack' in included files"
This seems to have broken the sanitizer-x86_64-linux buildbot. Reverting until
it's fixed, especially since this landed just before the 5.0 branch.
> This commit adds a new -Wpragma-pack warning. It warns in the following cases:
>
> - When a translation unit is missing terminating #pragma pack (pop) directives.
> - When entering an included file if the current alignment value as determined
> by '#pragma pack' directives is different from the default alignment value.
> - When leaving an included file that changed the state of the current alignment
> value.
>
> rdar://10184173
>
> Differential Revision: https://reviews.llvm.org/D35484
------------------------------------------------------------------------
llvm-svn: 308457
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.