CONFIDENTIAL PROPOSAL — APRIL 2026
Predictors of Completion in Digital Weight Management
A proposal for collaborative research — from Dominik to Samandika
The idea in one sentence: I built and operated a digital behavioral weight management program in the DACH region. I have the full operational archive — structured intake data, a manualized coaching algorithm, and deep longitudinal communication logs for each participant. I want to publish a retrospective analysis of what predicted who completed the program and who dropped out.
Why this matters
Everyone builds digital health interventions. Almost nobody publishes real-world data on what actually predicts completion vs. dropout in these programs.
The literature on digital therapeutics adherence predictors is thin, especially from practitioner-run programs (as opposed to controlled trials). What makes this dataset unusual is not its size — it's the depth per participant: timestamped coaching interactions, behavioral adaptation decisions, engagement trajectories, and outcome documentation, all from a real commercial program.
What I actually have
These are the raw assets sitting in my archive, ready for extraction and analysis:
1. Structured intake data: Tally Forms onboarding with format (group/individual/hybrid), duration, coach observations, and client expectations ("What would make this 10/10?")
2. The algorithm document: 100+ page manualized intervention protocol covering session sequencing, behavioral adaptation triggers, decision trees, and a 3-phase ghosting protocol.
3. Client communication logs: Slack archive (Nov 2022 – Mar 2026), ~70 individual client channels with timestamped coach-client interactions.
4. Master client sheet + sub-sheets: Google Sheets tracking system with an overview of all participants and individual sub-sheets per client (status, progress, notes). Source of truth for the cohort.
5. Telegram archive: 115 chat groups, 2,238 contacts, main group with 2,266 messages. Partially exported.
6. Payment behavior data: invoicing tracked in Slack. Payment delays, reminders, and collection patterns form an interesting early predictor of dropout.
7. Personalized plans: individualized client roadmaps (Canva-designed) showing tailored intervention sequencing.
8. CSAT scores: predictive satisfaction scoring system (implemented Feb 2025). Early satisfaction as a process predictor.
9. Content library + transcripts: full transcript archive of all educational videos consumed during the program, analyzable for engagement patterns.
10. PhD theoretical framework: doctoral thesis (magna cum laude, U Hamburg 2018) comparing six behavior change frameworks (TPB, TTM, SCT, BE, FBM, SDT).
What the paper would look like
Title options
- "Baseline and Behavioral Predictors of Program Completion in a Commercial Digital Weight Management Intervention: A Retrospective Case Series"
- "Who Completes a Digital Behavioral Weight Loss Program? Identifying Predictors from Real-World Practice Data"
Design
Retrospective case series with deep behavioral data. We would focus on a sub-cohort with the richest documentation — participants for whom we have complete intake data, longitudinal communication logs, and documented outcomes. Quality over quantity: the value is in the depth of behavioral signals per participant, not in large N.
Baseline predictor variables
| Variable | Source | Type |
|---|---|---|
| Program format (group / individual / hybrid) | Tally intake | Categorical |
| Program duration selected | Tally intake | Continuous |
| Client expectations (coded from free text) | Tally intake | Categorical |
| Payment behavior (on-time / delayed / reminded) | Slack invoicing | Categorical |
| Coach observations at intake (coded) | Tally notes | Categorical |
Early behavioral predictor variables
| Variable | Source | Type |
|---|---|---|
| Response latency to first coach message | Slack timestamps | Continuous |
| Message frequency in week 1–2 | Slack timestamps | Continuous |
| Engagement decay rate (messages/week over time) | Slack timestamps | Continuous |
| Ghosting onset (days until first silence >7 days) | Slack timestamps | Continuous |
| CSAT score (where available) | Slack CSAT | Continuous |
| Contact channel preference (call vs. text) | Slack notes | Categorical |
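As a sketch of how these signals could be pulled from the logs, here is a minimal pure-Python extraction over one client's message history. The `(timestamp, sender)` tuple format, the `behavioral_signals` helper, and the "coach"/"client" labels are assumptions standing in for the real Slack export schema:

```python
from datetime import datetime, timedelta

GHOST_GAP = timedelta(days=7)  # silence threshold from the table above

def behavioral_signals(messages, program_start):
    """Derive early behavioral predictors from one client's message log.

    `messages` is a list of (timestamp, sender) tuples sorted by time,
    where sender is "coach" or "client" -- a simplified stand-in for
    the real export schema. Assumes at least one coach message exists.
    """
    # Response latency: hours between first coach message and first client reply
    first_coach = next(t for t, s in messages if s == "coach")
    first_reply = next((t for t, s in messages if s == "client" and t > first_coach), None)
    latency_h = (first_reply - first_coach).total_seconds() / 3600 if first_reply else None

    # Message frequency in weeks 1-2
    cutoff = program_start + timedelta(weeks=2)
    early_msgs = sum(1 for t, s in messages if s == "client" and t <= cutoff)

    # Ghosting onset: days from program start until the first client
    # silence longer than GHOST_GAP
    client_times = [t for t, s in messages if s == "client"]
    ghost_onset = None
    for prev, nxt in zip(client_times, client_times[1:]):
        if nxt - prev > GHOST_GAP:
            ghost_onset = (prev - program_start).days
            break
    return {"latency_h": latency_h,
            "early_msgs": early_msgs,
            "ghost_onset_days": ghost_onset}
```

The same loop generalizes to the engagement decay rate (fit a slope to messages per week); the point is that every predictor in the table reduces to arithmetic over timestamps.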
Analysis plan
- Descriptive statistics for completers vs. non-completers
- Univariate comparisons (chi-square for categorical predictors; t-test or Mann-Whitney U for continuous)
- Logistic regression for multivariate prediction model
- Possibly: survival analysis (time to dropout) with Cox proportional hazards regression
- Sensitivity analysis: different definitions of "completion"
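To make the multivariate step concrete, here is a minimal logistic-regression sketch in pure Python (plain gradient descent), assuming standardized features. In practice this would be statsmodels or scikit-learn, and the toy predictors in the usage below (`payment_delayed`, `high_week1_messages`) are hypothetical:

```python
import math

def logistic_fit(X, y, lr=0.1, epochs=2000):
    """Fit P(completion = 1) with a logistic model by gradient descent.

    X: list of feature vectors (already standardized), y: 0/1 outcomes.
    A minimal stand-in for a statsmodels/scikit-learn fit on a small cohort.
    """
    n_feat = len(X[0])
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        gw = [0.0] * n_feat
        gb = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1 / (1 + math.exp(-z))   # predicted completion probability
            err = p - yi                 # gradient of the log-loss
            for j in range(n_feat):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gwj / len(X) for wj, gwj in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def predict(w, b, x):
    """Predicted completion probability for one feature vector."""
    return 1 / (1 + math.exp(-(b + sum(wj * xj for wj, xj in zip(w, x)))))

# Hypothetical toy cohort: features = [payment_delayed, high_week1_messages]
X = [[0, 1], [0, 1], [0, 0], [1, 0], [1, 0], [1, 1], [0, 1], [1, 0]]
y = [1, 1, 1, 0, 0, 1, 1, 0]
w, b = logistic_fit(X, y)
```

With small N the real analysis would report odds ratios with confidence intervals and keep the number of predictors well below the event count; this sketch only shows the shape of the model.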
Limitations (upfront)
- Retrospective, single-arm, single-center
- Small N (deep case series, not large-sample epidemiology)
- No standardized outcome measures (weight data from screenshots, not structured)
- Self-selected commercial participants (not representative of general population)
- German-speaking DACH population only
- Coach effects not controlled (multiple coaches over the program period)
These are real but standard for practitioner case series. The value is not in statistical power — it's in the depth of behavioral signal extraction per participant and the replicable methodology.
Why this works
- Novel angle: Most DTx adherence papers come from controlled trials. This is operational data from a real commercial program — messy but authentic.
- Depth over breadth: Small N but unusually rich longitudinal behavioral data per participant. Payment behavior, response latency, ghosting patterns — these are exactly the "digital behavioral biomarkers" the field talks about but rarely demonstrates from practice.
- Theory-grounded: My PhD provides the theoretical backbone (6 frameworks). This isn't atheoretical data mining — it's theory-informed predictor selection.
- Scalable methodology: The messaging-based signal extraction approach is replicable by any digital health program. The method is the contribution, not just the findings.
- Lifestyle medicine relevance: Weight management is core lifestyle medicine. APJLM scope. Your network.
Where to publish
| Journal | Fit | Timeline | Impact |
|---|---|---|---|
| medRxiv | Speed | Upload in days | Immediate DOI |
| JMIR Formative Research | Digital health, practice reports | 6–10 weeks | IF ~3.0 |
| BMJ Open | Open access, broad | 8–12 weeks | IF ~2.5 |
| Frontiers in Digital Health | Strong fit, fast review | 6–8 weeks | IF ~3.2 |
| APJLM | Our journal | Fast track? | Building IF |
| Lifestyle Medicine (Wiley) | WLMO connection | Standard | Niche |
Recommendation: medRxiv first (speed + DOI), then submit to JMIR Formative Research or Frontiers in Digital Health.
Author roles
| Contribution | Who |
|---|---|
| Conceptualization | DD + SS |
| Data provision & anonymization | DD |
| Data analysis & statistical modeling | SS |
| Methodology & study design | SS |
| Program design & theoretical framework | DD |
| Writing — methods & results | DD + SS |
| Writing — discussion & implications | DD + SS |
| Lifestyle medicine framing | SS |
| Critical revision | SS |
Author line: Dotzauer D, Saparamadu S. (open to discuss order)
Role clarity: DD designed and operated the program (health technology founder). SS brings the clinical and epidemiological lens (physician + MPH) to analyze the data independently.
Ethics
- No ethics approval needed for retrospective analysis of anonymized operational data (quality improvement framework)
- All client data fully anonymized
- A "data for research purposes" clause was being added to contracts from Feb 2025 onward (earlier participants may predate it)
The bigger picture
This paper is Paper #1 in a series. One dataset, multiple publications:
Paper 1: Predictors of completion
This proposal. Retrospective, descriptive. The entry point.
Paper 2: The intervention algorithm
Methods/protocol paper. The 100-page coaching algorithm, formalized for publication.
Paper 3: Behavioral phenotype taxonomy
Clustering clients into response profiles using unsupervised ML on the same data.
Paper 4: Theory comparison applied
Mapping which behavior change framework best explains the observed predictors.
All four build toward my Shenzhen research agenda: AI-driven behavioral phenotyping for chronic disease intervention.
What I need from you
- Sanity check: Does this design hold up? Am I missing obvious methodological problems?
- Are you in? If yes, I start data extraction this week.
- Analysis guidance: Your MPH training would help with the regression modeling.
- Timeline: medRxiv preprint within 2–3 weeks. Realistic?