Note that these conditions are offered as sufficient but not necessary.
(Bratman originally claimed that they were necessary and sufficient, but
nothing in the construction rules out alternative realisations of the functional
characterisation of shared intention.)
As it stands, then, this objection does not establish much.  It concerns conditions
imposed by the substantial account of shared intention, and these conditions are
sufficient but not necessary.
The substantial account is supposed to characterise one way (perhaps one among many)
in which the functional role of shared intentions can be realised.  So the objection 
serves only to raise a question. 
**Are there in fact alternative sufficient conditions 
for shared intention, conditions that can be met without already having abilities to use 
psychological concepts whose development was supposed to be explained by joint action?**
The answer to this question is not entirely straightforward.  We must begin with the 
functional roles of shared intention,  for these provide necessary conditions.  One of 
the roles of shared intentions is to coordinate planning.  What does coordinating planning 
involve?  Intuitively the idea is that just as individual intentions serve to coordinate an 
individual’s planning over time, so shared intentions coordinate planning between agents. 
(I use the terms ‘individual intention’ and ‘individual goal’ to refer to intentions and 
goals explanatory of individual actions; an ‘individual action’ is an action performed by 
just one agent such as that described by the sentence ‘Ayesha repaired the puncture all 
by herself’.)  A second role for shared intentions is to structure bargaining concerning 
plans.  To understand these roles it is essential to understand what ‘planning’ means in 
this context.  The term ‘planning’ is sometimes used quite broadly to encompass processes 
involved in low-level control over the execution of sequences of movements, as is often 
required for manipulating objects manually, as well as processes
controlling the movements of a limb on a single trajectory.  In Bratman’s account and this paper, the term ‘planning’ is used in a narrower sense.  Planning in this narrow sense exists to coordinate an agent’s various activities over relatively long intervals of time; it involves practical reasoning and forming intentions which may themselves require further planning, generating a hierarchy of plans and subplans.  Paradigm cases include planning a birthday party or planning to move house.
Given the functional roles of shared intention, when (if ever) must the states which realise shared intentions include intentions about others’ intentions?  Coordinating plans with others does not seem always or in principle to require specific intentions about others’ intentions.  It is plausible that in everyday life some of our plans are coordinated largely thanks to a background of shared preferences, habits and conventions.  Consider, for example, people who often meet in a set place at a fixed time of day to discuss research over lunch.  These people can coordinate their lunch plans merely by setting a date and following established routine; providing nothing unexpected happens, they seem not to need intentions about each other’s intentions.  Within limits, then, coordinating plans may not always require intentions about intentions.  The same may hold for structuring bargaining.  But when the background of shared preferences, habits and conventions is not sufficient to ensure that our plans will be coordinated, it is necessary to monitor or manipulate others’ plans.  And since intentions are the basic elements of plans (in the special sense of ‘plan’ in terms of which Bratman defined shared intention), this means monitoring or manipulating others’ intentions.  The background which makes for effortlessly coordinated planning is absent when our aims are sufficiently novel, when the circumstances are sufficiently unusual (as in many emergencies), and when our co-actors are sufficiently unfamiliar.  In all of these cases, coordinating plans and structuring bargaining will involve monitoring or manipulating others’ intentions.
Now this does not necessarily involve forming intentions about their intentions because, in principle, monitoring and manipulating others’ intentions could (within limits) be achieved by representing states which serve as proxies for intentions rather than by representing intentions as such, much as one can (within limits) monitor and manipulate others’ visual perceptions by representing their lines of sight.  But possession of general abilities to monitor and manipulate others’ intentions does require being able to form intentions about others’ intentions.
The question was whether there are sufficient conditions for shared intention which do not presuppose abilities to use psychological concepts whose development is supposed to be explained by joint action.  As promised, the answer is not straightforward.  In a limited range of cases, coordinating plans and perhaps structuring bargaining does not appear to require insights into other minds.  But in other cases, particularly cases involving novel aims or agents unfamiliar with each other, intentions about others’ intentions are generally required. 
The main question for this section was whether Bratman’s account captures a notion of joint action suitable for explaining the early development of children’s abilities to think about minds.  Some of the joint actions which young children engage in involve novel aims, and some involve unfamiliar partners.  So if these joint actions did involve coordinating planning and structuring bargaining, they could not rest on a shared background but would require abilities to form intentions about others’ intentions.  It follows that joint action would presuppose much of the sophistication in the use of psychological concepts whose development it was supposed to explain.  So given the premise that joint action plays a role in explaining early developments in understanding minds, it cannot be the case that the earliest joint actions children engage in involve shared intentions as characterised by Bratman.