#50 - The Best Astronauts for the Job Just Happened to Be Men
Team performance science tells a different story.
Last week, NASA announced its crew for Artemis III. The backlash was immediate: after the previous mission sent Christina Koch, the first woman to travel to the Moon, no women were selected for this crew.

NASA Administrator Jared Isaacman said those selected are “the best astronauts to undertake and complete the mission's objectives”. With a corps of 37 astronauts, 15 of whom are women (41%), it’s hard to agree with that logic. Part of the argument, that this is a mission testing docking procedures and so needs skilled pilots on the crew, doesn't stand up to scrutiny. The crew NASA actually selected isn't built exclusively around piloting expertise: Bresnik commands, Parmitano pilots, and Douglas and Rubio fly as mission specialists. And even if it were, there are four women in the corps with the credentials to fly this mission: test pilots Jasmin Moghbeli, Nicole Aunapu Mann and Jessica Wittner, and flight instructor Nichole Ayers. Put simply, politics interfered in this selection. The current US Administration’s allergy to DEI initiatives is well documented, including rolling back a pledge to land the first woman on the Moon.
Artemis III is the most operationally exposed of the Moon missions currently planned. Whilst Artemis II sent the space capsule around the far side of the Moon, and Artemis IV will land humans on its surface for the first time in more than 50 years, Artemis III is essentially an equipment test that won't leave low Earth orbit (LEO). As an essential step in testing the Human Landing System developed by SpaceX and Blue Origin, this mission is all about operational risk reduction. And that context matters for who gets selected.
What the research says about team composition and performance
I’ve been trawling through the research on the gender composition of teams, and the impact of that on team performance, as part of preparing for my TEDx (which is less than two weeks away now!). Studies from the late 2000s through to last year show findings that look mixed at first glance, but a clear pattern emerges once you account for context. Let me take you through it now.
Joshi & Roh (2009) — The Role of Context in Work Team Diversity Research: A Meta-Analytic Review (Academy of Management Journal)
A meta-analysis of 39 field studies, covering 8,757 teams, which examined how diversity (including gender) relates to team performance and, crucially, how context changes that relationship. The analysis found that gender (and other relations-oriented diversity like race and age) had a tiny negative effect overall. However, this aggregate result masks large variation across the individual studies. Once you account for context, the effect sizes double or triple.
They also found a temporal dimension: short-term teams benefit from diversity, whilst long-term teams suffer from it.
So the question ‘is gender diversity good or bad for performance?’ only makes sense when you specify the team context.
Woolley, Chabris, Pentland, Hashmi & Malone (2010) — Evidence for a Collective Intelligence Factor in the Performance of Human Groups (Science)
Across two studies with 699 people working in small groups, the authors found that a single statistical factor, collective intelligence, predicted how well a group performed across a wide variety of tasks, much like general intelligence does for individuals. Collective intelligence was not strongly predicted by the average or maximum IQ of the group's members. Putting smart individuals together does not automatically produce a smart group.
Instead, three factors predicted collective intelligence: average social sensitivity of members, equality in conversational turn-taking, and the proportion of women in the group. The gender effect was largely mediated by social sensitivity, i.e. women in the sample scored higher on social sensitivity than men.
However, this lab-based study did not engage with the contextual moderators Joshi & Roh had flagged a year earlier.
Curșeu, Pluut, Boroș & Meslec (2014/2015) — The Magic of Collective Emotional Intelligence in Learning Groups (British Journal of Psychology)
Here the researchers studied 100 student learning groups (528 students) at two time points four weeks apart. They found that the higher the percentage of women in a group, the more collective emotional intelligence the group developed. Collective emotional intelligence in turn increased group cohesion, reduced relationship conflict, and improved group effectiveness.
This builds directly on Woolley et al.’s work, showing that women, on average, bring higher social sensitivity and more communal traits, which seed emotionally intelligent group norms.
Bell & Outland (2017) — Team Composition Over Time (Research on Managing Groups and Teams)
This is a book chapter rather than a paper, so it synthesises previous research on team composition and how it shapes dynamics across a team’s lifecycle.
The authors state that surface-level composition variables like gender tend to have fleeting effects: their influence often fades as team members get to know each other and deeper-level attributes take over. However, they also flag that Curșeu et al. is a notable exception where gender composition had durable effects via collective emotional intelligence.
Read this against the backdrop of increasingly dynamic teams: membership churns, boundaries are fluid, and people belong to multiple teams at once, all of which complicates any snapshot view of composition.
Kearney, Gebert & Voelpel (2022) — Gender Diversity and Team Performance Under Time Pressure (Journal of Organizational Behavior)
This study directly tests how gender diversity interacts with time pressure — particularly relevant for high-stakes operational teams. The researchers argue that gender differences in moral reasoning could increase the chances that different viewpoints and interests are considered when completing the task, and gender differences in risk preferences could help ensure that teams strike a sensible balance between taking too much and too little risk.
Interestingly, the study used complex tasks, like hidden-profile and conjunctive tasks, where contributions from all team members are required and information must be elaborated rather than just aggregated. This task structure resembles mission-critical operations where distributed expertise has to be integrated under pressure.
The proposed mechanism is that gender-diverse teams under time pressure are less likely to withdraw and more likely to engage in information elaboration, leading to better outcomes on complex tasks. However, the authors flag a limitation: their research was conducted with newly formed student teams, leaving open how team tenure and trust interact with these effects in real-world high-stakes settings.
Enache, Scarlat, Charalambous, Duboisée de Ricquebourg & Shields (2025) — Gender Diversity Helps Teams Maintain Integrity Under Pressure (Harvard Business Review)
New research found that simply having more women on the team tilted behaviour toward greater honesty in financial analysis, acting as a form of soft regulation that arises organically from within the team, complementing formal compliance and external oversight.
The mechanism described is resistance to optimism bias and conflict-of-interest distortions under pressure — gender-diverse teams were more likely to flag risk honestly rather than go along with rosy projections. This finding is relevant to any high-stakes setting where there’s pressure to conform to an optimistic operational narrative (like wanting the docking procedure to be a success).
The variables that influence gender’s effect on team performance
Taken together, these six sources point to a number of variables that influence whether a team’s gender composition affects its performance:
Time pressure
Task complexity
Team tenure (how long they have been formed)
Industry status hierarchies
Whether the setting itself is male-dominated
Astronaut crews sit at the intersection of all of these variables:
They train together as a team for years before going on a mission that may only last weeks: this intensity-to-duration ratio is unusual, and the previous findings may not extrapolate easily to the different phases an astronaut crew goes through.
Astronauts train not just to pilot rockets, landers and space capsules, but also in geology, photography, emergency medicine, equipment maintenance and repairs, and more (Christina Koch famously was the crew’s plumber when the space toilet wasn’t working!).
Space is notoriously gender unequal: according to UNOOSA, women make up around 25-30% of the space workforce globally, and just 11% of all astronauts to date.
The short of it is this: gender diversity hasn’t been extensively studied by NASA. In Apollo it wasn’t a factor as all the astronauts were men, and in the space shuttle and ISS era it wasn’t systematically studied.
So to bring it back to what Isaacman said: the best people for this mission have been selected, and they just so happened to be men. Unfortunately, we’re lacking the evidence to support that claim, because nobody has done the research on this specific context. What we do have is decades of adjacent evidence suggesting that gender-diverse teams perform better under exactly the conditions Artemis III will face.
We also have the closest real-world analogue we're going to get to a space mission: Antarctic expeditions. Long-duration, high-stakes, small isolated teams, distributed expertise across a wide skill set, intense training-to-mission ratio. The all-female teams that have crossed the continent, the Ice Maidens among them, show that women perform at the highest level in exactly the kind of extreme environment Artemis III is rehearsing for.
Four men were selected for Artemis III. The claim that they are 'the best' is a claim NASA hasn't done the work to make.


