name: jde-replication-and-data-policy
description: Use when assembling the data and code deposit for a Journal of Development Economics (JDE) submission or accepted paper. Covers JDE's mandatory replication policy, Mendeley Data hosting, and the fact that data/code can be requested at the review stage. Builds the package; it does not run the analysis.
Replication & Data Policy (jde-replication-and-data-policy)
When to trigger
- A paper is heading toward submission or acceptance and the replication package is not built
- You are unsure what JDE requires and when
- Some data are proprietary/restricted (government microdata, partner-NGO data) and you need a compliant path
- You want to pre-empt a referee asking for data and code during review
JDE's policy (what it actually requires)
JDE operates a mandatory replication policy. The journal will, in general, publish a paper only if the data used are clearly and precisely documented and readily available to any researcher for replication. Authors of accepted papers containing empirical, simulation, or experimental work must provide, prior to publication, the data, programs, and other computational details sufficient to permit replication; these are posted on the JDE website alongside the article. JDE partners with Mendeley Data to host research data, displayed next to the article on ScienceDirect.
Crucially, this is not only an acceptance-stage formality: editors or referees may request the data, programs, and computational details at the review stage. So the package must be submission-ready, not assembled after acceptance.
Build the package as you go
- One master script (`run_all`) that regenerates every table and figure from raw or constructed data, in order.
- A README documenting data provenance, every construction/cleaning step, software and package versions, run time, and which script produces which exhibit.
- Pinned environment: record Stata `ssc`/`net` versions, `renv.lock`, or `requirements.txt`; set and report all random seeds (bootstrap, randomization inference, simulation).
- Self-contained: relative paths, no machine-specific dependencies, no orphaned hand edits to outputs.
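
The master-script idea above can be sketched in Python as follows. This is a minimal illustration, not JDE's required layout: the step file names, the seed value, and the `run_log.json` file are all hypothetical, and a Stata or R project would use its own equivalent. Each step is assumed to accept the seed as its first command-line argument.

```python
"""run_all.py: a minimal master-script sketch (step names are hypothetical)."""
import json
import platform
import subprocess
import sys
from pathlib import Path

SEED = 20240101  # hypothetical value; report the seed actually used in the README

# Ordered pipeline, as paths relative to the package root (hypothetical names).
STEPS = [
    "code/01_clean.py",
    "code/02_construct.py",
    "code/03_tables.py",
    "code/04_figures.py",
]


def run_all(root, steps=STEPS, seed=SEED):
    """Run every step in order from `root`, stopping at the first failure."""
    root = Path(root).resolve()
    log = {"python": platform.python_version(), "seed": seed, "steps": []}
    for step in steps:
        script = root / step
        if not script.is_file():
            raise FileNotFoundError(f"missing step: {step}")
        # check=True raises CalledProcessError if the step exits non-zero,
        # so later steps never run on top of a failed one.
        subprocess.run(
            [sys.executable, str(script), str(seed)], cwd=root, check=True
        )
        log["steps"].append(step)
    # Record interpreter version, seed, and the steps that ran.
    (root / "run_log.json").write_text(json.dumps(log, indent=2))
    return log
```

Calling `run_all(".")` from the package root runs the steps relative to that directory, which keeps the package free of machine-specific paths.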
Restricted or proprietary data
JDE's standard is that data be available for replication, but development work often uses restricted government surveys or partner-collected data that cannot be redistributed. In that case:
- Provide all programs and a precise, documented access path so another researcher could obtain the data and rerun the code.
- Disclose the restriction clearly (cover letter and README) rather than silently omitting data.
- Where possible, deposit a de-identified extract or simulated data that exercises the code.
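
One way to produce simulated data that exercises the code is to generate a file with the same column names as the restricted extract but purely random values. The sketch below assumes a hypothetical three-column schema and made-up distributions; a real package would mirror the actual variable names and types, and the simulated values carry no information about the restricted data.

```python
import csv
import random
from pathlib import Path

# Hypothetical schema: column name -> generator taking (seeded rng, row index).
SCHEMA = {
    "household_id": lambda rng, i: i + 1,
    "treated": lambda rng, i: rng.randint(0, 1),
    "consumption": lambda rng, i: round(rng.lognormvariate(4.0, 0.5), 2),
}


def write_synthetic_extract(path, n_rows=200, seed=12345):
    """Write a simulated CSV with the restricted file's columns (values are fake)."""
    rng = random.Random(seed)  # local generator, so the output is reproducible
    path = Path(path)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(SCHEMA))
        writer.writeheader()
        for i in range(n_rows):
            writer.writerow({col: gen(rng, i) for col, gen in SCHEMA.items()})
    return path
```

Because the generator is seeded, anyone rerunning the script gets the same test file, so the analysis code can be checked end to end without access to the restricted source.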
Pre-results review note
For pre-results-reviewed papers, the pre-specified research plan (hypotheses, statistical analysis plan, power analysis) is included in the supplementary online appendix, and the published article carries a footnote noting the pre-results submission — the analysis plan is itself part of the replication record.
Data-status routing table
| Data in the paper | What the JDE package must contain |
|---|---|
| Author-collected RCT survey data | De-identified microdata + master script + survey instruments |
| Government microdata under a data-use agreement (DUA) | All programs + a documented access path + a synthetic test extract |
| Partner-NGO administrative records | Programs + access contact + the construction code, data withheld |
| Public secondary data (DHS, World Bank) | Download script or pinned snapshot + every cleaning step |
Worked micro-example (illustrative)
Hypothetical: a cluster-randomized microfinance experiment whose endline consumption data are de-identifiable but whose credit-bureau linkage is restricted.
- Deposit shape: `run_all.do` regenerates all 6 tables and 4 figures in order; `/raw` holds the de-identified survey panel; `/restricted_README.md` gives the bureau access path and the contact; seeds for the randomization-inference p-values are set and printed.
- Disclosure line (cover letter + README): "Endline survey data deposited de-identified on Mendeley Data; the bureau linkage is under a data-use agreement and cannot be redistributed. Access path and code are provided so the step is reproducible."
File counts are illustrative.
Referee / editor pushback at the review stage
- "Please share the data and code so I can check the attrition handling." → JDE referees can ask mid-review; the package must already be staged, not promised for acceptance.
- "Your Lee-bounds table will not reproduce — which script makes it?" → the README must map each exhibit to its script; an unmapped table reads as a reproducibility risk.
- "You cite restricted data but share nothing." → withholding restricted data is fine; withholding the programs and access path is not, and is a common avoidable rejection trigger here.
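
The exhibit-to-script mapping that the Lee-bounds exchange calls for can be checked mechanically before submission. The sketch below is one possible check, not a JDE tool: the map contents and file names are hypothetical, and the map is assumed to mirror what the README states.

```python
from pathlib import Path

# Hypothetical exhibit-to-script map, kept in sync with the README.
EXHIBIT_MAP = {
    "Table 1": "code/03_tables.py",
    "Table 2": "code/03_tables.py",
    "Figure 1": "code/04_figures.py",
}


def unmapped_exhibits(exhibits, exhibit_map, root):
    """Return (exhibit, reason) pairs for exhibits lacking a usable script."""
    problems = []
    for name in exhibits:
        script = exhibit_map.get(name)
        if script is None:
            problems.append((name, "no script listed"))
        elif not (Path(root) / script).is_file():
            problems.append((name, f"script not found: {script}"))
    return problems
```

An empty return value means every exhibit named in the paper points at a script that exists in the package; anything else is a gap to close before a referee finds it.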
Anti-patterns
- Treating replication as an acceptance-stage chore — JDE can request it during review
- A code dump with no README and no master script
- Hand-edited tables that the code does not reproduce
- Claiming proprietary data as a reason to share neither programs nor an access path
- Unset seeds, so bootstrap or randomization-inference results shift on rerun
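
The unset-seed problem is easy to see in a small case. The sketch below computes a nonparametric bootstrap standard error of a mean using only the Python standard library; the function name, default seed, and replication count are illustrative assumptions, and a real package would apply the same principle with whatever estimator the paper uses.

```python
import random
import statistics


def bootstrap_se(values, n_reps=500, seed=2024):
    """Bootstrap standard error of the mean, reproducible for a given seed."""
    rng = random.Random(seed)  # local, seeded generator; no global state
    n = len(values)
    # Resample with replacement n_reps times and record each resample's mean.
    means = [statistics.fmean(rng.choices(values, k=n)) for _ in range(n_reps)]
    return statistics.stdev(means)
```

With the seed fixed and reported, two runs of `bootstrap_se(values)` return the identical number; if the generator were seeded from the system clock instead, the reported standard error would drift between reruns.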
Output format
【Master script regenerates all exhibits?】[Y/N]
【README】provenance + steps + versions + seeds? [Y/N]
【Environment pinned】[Stata/R/Python versions]
【Data status】public / restricted (+ access path)
【Mendeley Data deposit prepared?】[Y/N]
【Pre-results plan in appendix?】[Y/N/NA]
【Next step】jde-review-process