Sound-shape associations (e.g., preferentially matching angular shapes with high-pitched sounds and smooth shapes with low-pitched ones) have been almost universally observed in humans. If cross-modally congruent sounds and shapes are more robustly integrated, distinguishing them in time might be more challenging than for incongruent sound-shape pairings. Supporting this premise, a highly cited study by Parise and Spence (2009; n = 12) reported worse temporal order judgement performance for audiovisual stimuli with congruent than with incongruent sound-shape associations. Here, we report the results of five experiments across two laboratories, including a preregistered replication attempt, all of which (∑n = 102) failed to replicate the original results. Additionally, frequentist and Bayesian meta-analyses found no evidence against the null hypothesis, revealing a negligible effect size. The combined results indicate that multisensory temporal resolution in humans is unaffected by sound-shape associations, which might arise at a later (or parallel) processing stage compared to cross-modal temporal order judgements. (PsycInfo Database Record (c) 2024 APA, all rights reserved).