Commit Graph

84609 Commits

Author SHA1 Message Date
Diego Novillo
0354a9f67b SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminators
The discriminators pass relied on the presence of llvm.dbg.cu to decide
whether to add discriminators, but this fails in the case where debug
info is only enabled partially when -fprofile-sample-use is active.

The reason llvm.dbg.cu is not present in these cases is to prevent
codegen from emitting debug info (as it is only used for the sample
profile pass).

This changes the discriminators pass to also emit discriminators even
when debug info is not being emitted.

llvm-svn: 252763
2015-11-11 17:54:37 +00:00
Hemant Kulkarni
c6638c7561 [Symbolizer]: Add -pretty-print option
Differential Revision: http://reviews.llvm.org/D13671

llvm-svn: 252760
2015-11-11 17:47:54 +00:00
Sanjay Patel
f740129198 [MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any MIPS32
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:
  jr  $ra
  clz  $2, $4

cttz:
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  jr  $ra
  subu  $2, $2, $1

Instead of:

ctlz:
  beqz  $4, $BB0_2
  addiu  $2, $zero, 32
  clz  $2, $4
$BB0_2:
  jr  $ra
  nop

cttz:
  beqz  $4, $BB1_2
  addiu  $2, $zero, 32
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  subu  $2, $2, $1
$BB1_2:
  jr  $ra
  nop

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14500

llvm-svn: 252755
2015-11-11 17:24:56 +00:00
Diego Novillo
0767ae5896 Properly fix unused variable in disable-assert builds.
I missed the side-effects of ParseBFI in my previous attempt (r252748).
Thanks dblaikie for the suggestion of adding a void use of the unused
variable instead.

llvm-svn: 252751
2015-11-11 16:39:22 +00:00
Diego Novillo
29f88a2460 Remove unused variable in disable-assert builds. NFC.
llvm-svn: 252748
2015-11-11 16:14:52 +00:00
Douglas Katzman
a14039764b Visibly fail if attempting to encode register AH,BH,CH,DH in a REX-prefixed instruction.
Differential Revision: http://reviews.llvm.org/D13316
Fixes PR25003

llvm-svn: 252743
2015-11-11 15:51:16 +00:00
James Molloy
ce12c92f66 [ARM] Combine BFIs together
If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.

llvm-svn: 252740
2015-11-11 15:40:40 +00:00
Charlie Turner
d82c9389e7 [SLP] Enable -slp-vectorize-hor by default.
Measurements primarily on AArch64 have shown this feature does not
significantly effect compile-time. The are no significant perf changes in LNT,
but for AArch64 at least, there are wins in third party benchmarks.

As discussed on llvm-dev, we're going to try turning this on by default and see
how other targets react to the change.

llvm-svn: 252733
2015-11-11 15:03:46 +00:00
Aaron Ballman
470b5f1a79 Silencing a signed vs unsigned type mismatch warning.
llvm-svn: 252732
2015-11-11 14:57:28 +00:00
Aaron Ballman
107bb0d193 Silencing nine warnings for "enumeral and non-enumeral type in conditional expression"; NFC.
llvm-svn: 252728
2015-11-11 13:44:06 +00:00
Michael Kuperstein
12982a816c [X86] Replace LEAs with INC/DEC when profitable
If possible and profitable, replace lea %reg, 1(%reg) and lea %reg, -1(%reg) with inc %reg and dec %reg respectively.

Patch by: anton.nadolsky@intel.com
Differential Revision: http://reviews.llvm.org/D14059

llvm-svn: 252722
2015-11-11 11:44:31 +00:00
Yury Gribov
d7731988ef [ASan] Enable optional ASan recovery.
Differential Revision: http://reviews.llvm.org/D14242

llvm-svn: 252719
2015-11-11 10:36:49 +00:00
Craig Topper
b24a58e28f [X86] Fix feature flags on some MMX register instructions that really were introduced with SSE or SSE2.
llvm-svn: 252709
2015-11-11 07:29:25 +00:00
Craig Topper
700a1a23d7 [X86] Remove redundant MMX isel patterns.
llvm-svn: 252708
2015-11-11 07:29:22 +00:00
Dan Gohman
754cd11d90 [WebAssembly] Support non-legal argument and return types.
llvm-svn: 252687
2015-11-11 01:33:02 +00:00
Ahmed Bougacha
4a85643907 [MC] Use LShr for constant evaluation of ">>" on non-arm64 darwin.
Follow-up to r235963: this matches other assemblers and is less
unexpected (e.g. PR23227).

llvm-svn: 252681
2015-11-11 00:51:36 +00:00
Matthias Braun
2c98d0f477 MachineInstr: addRegisterDefReadUndef() => setRegisterDefReadUndef()
This way we can not only add but also remove read undef flags.

llvm-svn: 252678
2015-11-11 00:41:58 +00:00
Matt Arsenault
8246d4aead AMDGPU: Print more fields in comments
llvm-svn: 252677
2015-11-11 00:27:46 +00:00
Sanjoy Das
dc26df4abe [ValueTracking] Remove untested / unreachable code, NFC
Right now isTruePredicate is only ever called with Pred == ICMP_SLE or
ICMP_ULE, and the ICMP_SLT and ICMP_ULT cases are dead.  This change
removes the untested dead code so that the function is not misleading.

llvm-svn: 252676
2015-11-11 00:16:41 +00:00
Matt Arsenault
61cb6fa848 AMDGPU: Remove dead code
llvm-svn: 252675
2015-11-11 00:01:36 +00:00
Matt Arsenault
6690d7de39 AMDGPU: Set isAllocatable = 0 on VS_32/VS_64
llvm-svn: 252674
2015-11-11 00:01:32 +00:00
Sanjoy Das
925681053d [ValueTracking] Teach isImpliedCondition a new bitwise trick
Summary:
This change teaches isImpliedCondition to prove things like

  (A | 15) < L  ==>  (A | 14) < L

if the low 4 bits of A are known to be zero.

Depends on D14391

Reviewers: majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14392

llvm-svn: 252673
2015-11-10 23:56:20 +00:00
Sanjoy Das
af1400f84b [ValueTracking] Use m_APInt instead of m_ConstantInt, NFC
This change would add functionality if isImpliedCondition worked on
vector types; but since it bail out on vector predicates this change is
an NFC.

llvm-svn: 252672
2015-11-10 23:56:15 +00:00
Matthias Braun
4353b30542 TableGen: Emit LaneMask for register classes without subregisters as ~0u
This makes it slightly easier to handle classes with and without
subregister uniformly.

llvm-svn: 252671
2015-11-10 23:23:05 +00:00
Reid Kleckner
7f84a939ed [WinEH] Insert the MBB for EH_RESTORE after the catchret
Inserting it before the target block could be bad, we might already have
a fallthrough edge to it.

llvm-svn: 252670
2015-11-10 23:22:20 +00:00
Kostya Serebryany
b7e286bed7 [libFuzzer] add UninstrumentedTest.cpp (missing from a previous commit)
llvm-svn: 252658
2015-11-10 22:02:56 +00:00
Dan Gohman
16d314d300 [WebAssembly] Remove special cases for things that are no longer special. NFC.
llvm-svn: 252656
2015-11-10 21:48:21 +00:00
Bill Schmidt
3c44c6f189 Add PPCMIPeephole.cpp to CMakeLists.txt
llvm-svn: 252654
2015-11-10 21:43:45 +00:00
Dan Gohman
b84ae9bb38 [WebAssembly] Support for floating point min and max.
llvm-svn: 252653
2015-11-10 21:40:21 +00:00
Bill Schmidt
34af5e1c76 [PowerPC] Add an MI SSA peephole pass.
This patch adds a pass for doing PowerPC peephole optimizations at the
MI level while the code is still in SSA form.  This allows for easy
modifications to the instructions while depending on a subsequent pass
of DCE.  Both passes are very fast due to the characteristics of SSA.

At this time, the only peepholes added are for cleaning up various
redundancies involving the XXPERMDI instruction.  However, I would
expect this will be a useful place to add more peepholes for
inefficiencies generated during instruction selection.  The pass is
placed after VSX swap optimization, as it is best to let that pass
remove unnecessary swaps before performing any remaining clean-ups.

The utility of these clean-ups are demonstrated by changes to four
existing test cases, all of which now have tighter expected code
generation.  I've also added Eric Schweiz's bugpoint-reduced test from
PR25157, for which we now generate tight code.  One other test started
failing for me, and I've fixed it
(test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not
related to my changes, and I'm not sure why it works before and not
after.  The problem is that the CHECK-NOT: of "statepoint" from test1
fails because of the "statepoint" in test2, and so forth.  Adding a
CHECK-LABEL in between keeps the different occurrences of that string
properly scoped.

llvm-svn: 252651
2015-11-10 21:38:26 +00:00
Teresa Johnson
2d5fb8cac4 Ensure ModuleLinker materializes complete comdat groups
Summary:
The module linker lazy links some "discardable if unused" global
values (e.g. linkonce), materializing and linking them only
if they are referenced in the module. If a comdat group contains a
linkonce member that is not referenced, however, it would not be
materialized and linked, leading to an incomplete comdat group.

If there are other object files not part of the same LTO link that also
define and use that comdat group, the linker may select the incomplete
group leading to link time unsats.

To solve this, whenever a global value body is linked, make sure we
materialize any other members of the same comdat group that are not yet
materialized. This ensures they are in the lazy link list and get linked
as well.

Added new test and adjusted old test to remove parts that didn't
make sense with fix.

Reviewers: rafael

Subscribers: dexonsmith, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D14516

llvm-svn: 252647
2015-11-10 21:09:06 +00:00
Sanjoy Das
bd1c1bfbd2 [IR] Make {Call,Invoke}::cloneImpl aware of operand bundles
This was an omission in the patch that landed initial support for
operand bundles.  So far we haven't hit this, but we will once the
inliner is able to inline calls to functions that contain calls with
operand bundles.

llvm-svn: 252645
2015-11-10 20:13:21 +00:00
Sanjoy Das
b9ca6dcc6b [OperandBundles] Identify operand bundles with both their names and IDs
No code uses this functionality yet.  This change just exposes
information / structure that was already present.

llvm-svn: 252644
2015-11-10 20:13:15 +00:00
Sanjay Patel
33ec5dbe35 less indent; NFCI
llvm-svn: 252643
2015-11-10 20:09:02 +00:00
Sanjay Patel
af1b48bfdc [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:               
  clz  r0, r0
  bx  lr
cttz:              
  rbit  r0, r0
  clz  r0, r0
  bx  lr

Instead of:

ctlz:    
  cmp  r0, #0
  moveq  r0, #32
  clzne  r0, r0
  bx  lr
cttz:     
  cmp   r0, #0
  moveq  r0, #32
  rbitne  r0, r0
  clzne  r0, r0
  bx  lr

This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818

Differential Revision: http://reviews.llvm.org/D14469

llvm-svn: 252639
2015-11-10 19:24:31 +00:00
Matt Arsenault
aa118e299c LegalizeDAG: Implement promote for scalar_to_vector
This allows avoiding the default Expand behavior which
introduces stack usage. Bitcast the scalar and replace
the missing elements with undef.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252632
2015-11-10 18:48:11 +00:00
Matt Arsenault
a46aa641f2 LegalizeDAG: Implement promote for insert_vector_elt
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252631
2015-11-10 18:48:08 +00:00
Matt Arsenault
0b7958a59b LegalizeDAG: Implement promote for extract_vector_elt
This is for AMDGPU to implement v2i64 extract as extract of
half of a v4i32.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252630
2015-11-10 18:48:04 +00:00
Philip Reames
2d858747df [ValueTracking] Recognize that and(x, add (x, -1)) clears the low bit
This is a cleaned up version of a patch by John Regehr with permission. Originally found via the souper tool.

If we add an odd number to x, then bitwise-and the result with x, we know that the low bit of the result must be zero. Either it was zero in x originally, or the add cleared it in the temporary value. As a result, one of the two values anded together must have the bit cleared.

Differential Revision: http://reviews.llvm.org/D14315

llvm-svn: 252629
2015-11-10 18:46:14 +00:00
Teresa Johnson
dfbebc37da [ThinLTO] Update comment per change in WeakAny handling (NFC)
llvm-svn: 252627
2015-11-10 18:26:31 +00:00
Teresa Johnson
3cd8161c9b [ThinLTO] WeakAny fixes/cleanup
Ensure WeakAny variables are imported as ExternalWeak declarations. To
handle WeakAny more consistently and fix this issue:

1) Update helper doImportAsDefinition to properly flag WeakAny variables
   and aliases as not importing defintions.

   Update callers of doImportAsDefinition to remove now redundant checks for
   WeakAny aliases, or ignore aliases, as appropriate.

2) Add any !doImportAsDefinition GVs to DoNotLinkFromSource set during
   linking of the GV prototype, where we usually add GVs to the
   DoNotLinkFromSource set for other reasons.

   Remove now unnecessary adding of WeakAny aliases to
   DoNotLinkFromSource set from copyGlobalAliasProto.

   Remove now unnecessary guard against linking non-imported function
   bodies from ModuleLinker::run.

llvm-svn: 252626
2015-11-10 18:20:11 +00:00
Sanjay Patel
241c31fb64 [AArch64] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
AArch64 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any AArch64
implementation.

The net result of allowing this speculation for the regression tests in this
patch is that we get this code:

ctlz:
  clz  w0, w0
  ret

cttz:
  rbit  w8, w0
  clz  w0, w8
  ret

Instead of:

ctlz:
  cbz  w0, .LBB0_2
  clz  w0, w0
  ret
.LBB0_2:
  orr  w0, wzr, #0x20
  ret

cttz:
  cbz  w0, .LBB1_2
  rbit  w8, w0
  clz  w0, w8
  ret
.LBB1_2:
  orr  w0, wzr, #0x20
  ret

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14505

llvm-svn: 252625
2015-11-10 18:11:37 +00:00
Renato Golin
0e77d72b0a Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00
Michael Kuperstein
a01a5ee72f [X86] Do not try to custom-lower sitofp/fptosi in soft-float mode
Differential Revision: http://reviews.llvm.org/D14495

llvm-svn: 252621
2015-11-10 17:37:49 +00:00
Xinliang David Li
6021b75a1f Fix asan warning (NFC)
llvm-svn: 252617
2015-11-10 17:11:33 +00:00
Sanjay Patel
766589efdc add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html

The spilling problem is independent and not addressed by this patch.

The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. 
This is caused by inclusion of the "slack" factor when calculating the critical path of the original
code sequence. If we don't add that, then we have a more conservative cost comparison of the old code
sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64
MULADD patterns because benchmark regressions were observed without that.

The two failing test cases now have identical asm that does what we want:
a + b + c + d ---> (a + b) + (c + d)

Differential Revision: http://reviews.llvm.org/D13417

llvm-svn: 252616
2015-11-10 16:48:53 +00:00
James Molloy
9d55f19cfa Reapply "[ARM] Combine CMOV into BFI where possible"
Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh!

Original commit message:

If we have a CMOV, OR and AND combination such as:
  if (x & CN)
      y |= CM;

And:
  * CN is a single bit;
    * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

llvm-svn: 252606
2015-11-10 14:22:05 +00:00
Igor Laevsky
01c3692a10 Strip metadata when speculatively hoisting instructions
This is fix for PR24059.

When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.

This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.

Differential Revision: http://reviews.llvm.org/D14398

llvm-svn: 252604
2015-11-10 14:10:31 +00:00
Tilmann Scheller
990a8d88c8 [PowerPC] Remove redundant code.
The local variable Hi is never being read.

Issue identified by the Clang static analyzer.

llvm-svn: 252600
2015-11-10 12:29:37 +00:00
Oliver Stannard
d414c99b9c [AArch64] Fix halfword load merging for big-endian targets
For big-endian targets, when we merge two halfword loads into a word load, the
order of the halfwords in the loaded value is reversed compared to
little-endian, so the load-store optimiser needs to swap the destination
registers.

This does not affect merging of two word loads, as we use ldp, which treats the
memory as two separate 32-bit words.

llvm-svn: 252597
2015-11-10 11:04:18 +00:00