Limitations
Understanding what Devfolio Analyzer cannot do is as important as knowing what it can. This page documents known constraints, data gaps, and scoring edge cases.
GitHub API constraints
Contribution calendar is not available via API
The contribution calendar (the green squares on a GitHub profile) is not exposed by GitHub's REST or GraphQL API for users other than the authenticated user. Devfolio Analyzer scrapes this data from the rendered HTML at github.com/{username}.
Implications:
- This scrape can break if GitHub changes their HTML structure (this has happened twice, in 2023–2024)
- The scrape adds ~800ms to analysis time
- It fails silently for users with private contribution settings; contribution metrics default to null
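A minimal sketch of how such a scrape might parse the profile HTML and fail silently. The attribute names (data-date, data-level) are assumptions about GitHub's current markup, not a description of Devfolio Analyzer's actual implementation, and they are exactly the kind of detail that breaks when GitHub changes its HTML:

```python
import re
from typing import Optional

def parse_contribution_calendar(html: str) -> Optional[dict]:
    """Extract per-day contribution levels from profile HTML.

    Returns None (surfaced as null in contribution metrics) when the
    expected markup is absent, e.g. for private contribution settings
    or after a GitHub HTML change.
    """
    # Hypothetical attribute names; GitHub has changed this markup before.
    cells = re.findall(
        r'data-date="(\d{4}-\d{2}-\d{2})"[^>]*data-level="(\d)"', html
    )
    if not cells:
        return None  # fail silently, as described above
    return {date: int(level) for date, level in cells}
```

The surrounding fetch (an HTTP GET of github.com/{username}) is what accounts for the extra ~800ms of analysis time.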
Workaround: If you're analyzing your own profile and need accurate contribution data, authenticate via OAuth (coming in a future release) to use the authenticated API endpoint.
Private repositories are invisible
All analysis is based on public repository data only. Developers who do most of their work in private repos — common for professionals — will score poorly on project quality and contribution consistency regardless of their actual output.
There is no way to fix this without OAuth authentication and explicit user consent. This is by design.
Forked repos inflate repo counts
GitHub includes forked repositories in the public repo count and the default repo listing. Devfolio Analyzer excludes forks from analysis by default (include_forks: false), but the raw public_repos count in the profile object reflects GitHub's number, which includes forks.
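The fork filter is straightforward because GitHub's repository objects (GET /users/{username}/repos) carry a boolean fork field. A sketch, with a hypothetical function name, of how the default include_forks: false behavior could work:

```python
def analysis_repos(repos: list[dict], include_forks: bool = False) -> list[dict]:
    """Return the repositories that enter analysis.

    Each dict is a repository object from GitHub's REST API, which
    includes a boolean "fork" field. The profile's public_repos count
    is GitHub's own number and is NOT adjusted by this filter.
    """
    if include_forks:
        return repos
    return [r for r in repos if not r.get("fork")]
```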
Organization contributions are partially visible
Contributions made to organization repositories show up in the contribution calendar but not in the per-repo analysis unless the organization repos are also public. Some developers have significant work in organization-owned repos that won't surface in this analysis.
LLM scoring limitations
Scores vary slightly across runs
The LLM is called with temperature 0.2, which reduces but does not eliminate variance. Two analyses of the same profile within the same cache window will return identical results (cached). After cache expiry, small score differences (±3–5 points) are possible on re-analysis.
This is not a bug — it's an inherent property of probabilistic language models. The 6-hour cache mitigates this for most use cases.
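The interaction between the cache and score variance can be sketched as follows. This is an illustrative in-memory cache, not the actual implementation; the names and storage are assumptions:

```python
import time

CACHE_TTL = 6 * 60 * 60  # 6 hours, per the caching behavior above
_cache: dict[str, tuple[float, dict]] = {}

def analyze_cached(username: str, analyze) -> dict:
    """Return a cached result within the TTL; re-run the LLM otherwise.

    Within one cache window, repeated requests return the identical
    stored result. After expiry, analyze() runs again (temperature 0.2)
    and scores may drift a few points.
    """
    now = time.time()
    hit = _cache.get(username)
    if hit and now - hit[0] < CACHE_TTL:
        return hit[1]  # identical result, no variance possible
    result = analyze(username)  # fresh LLM call; small variance possible
    _cache[username] = (now, result)
    return result
```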
Generic profiles are scored conservatively
The LLM is instructed to be specific and reference actual repo names. For profiles with minimal data (no descriptions, no READMEs, no bio), the feedback will be generic because there is nothing specific to reference. The score will be low — which is accurate.
The model doesn't execute code
The LLM cannot evaluate code quality, architecture decisions, or algorithmic complexity. "Has tests" is detected by file presence, not by whether the tests are meaningful. A repo with a single empty test file will incorrectly get credit for test presence.
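A file-presence check of this kind might look like the sketch below. The patterns and function name are illustrative assumptions, but the limitation is visible in the code itself: an empty file at a test-like path passes the check.

```python
TEST_PATTERNS = ("test_", "_test.", "/tests/", ".spec.", "/spec/")

def has_tests(file_paths: list[str]) -> bool:
    """File-presence heuristic: True if any path looks like a test file.

    Says nothing about whether the tests are meaningful; a single
    empty test file satisfies it.
    """
    return any(p in path.lower() for path in file_paths for p in TEST_PATTERNS)
```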
Non-English READMEs
README quality evaluation is done by the LLM, which works best with English content. READMEs in other languages may be evaluated less accurately. This is a known gap with no current mitigation.
Scoring model limitations
Stars are not a quality signal, but they're used anyway
Repository star count is one input to the repo ranking algorithm (top 6 repos selected for detailed analysis). A popular but low-quality repo will be selected over a high-quality but obscure one. This is a reasonable heuristic but not a perfect one.
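Stars are only one input to the real ranking, but a deliberately simplified sketch of star-based selection makes the failure mode concrete (stargazers_count is the field GitHub's REST API uses for star counts):

```python
def top_repos(repos: list[dict], limit: int = 6) -> list[dict]:
    """Select repos for detailed analysis, ranked here by stars alone.

    Simplified: the actual ranking uses additional inputs. Either way,
    a popular low-quality repo can outrank an obscure high-quality one.
    """
    ranked = sorted(
        repos, key=lambda r: r.get("stargazers_count", 0), reverse=True
    )
    return ranked[:limit]
```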
Quantity is deliberately penalized
The scoring model does not reward having more repos. Twenty well-maintained repos score the same as, or lower than, five excellent ones. Developers with large numbers of small utility scripts, course projects, or abandoned experiments may find their scores lower than expected.
Account age is not scored
Devfolio Analyzer does not factor account age into scores. A 6-month-old account with strong signals can score higher than a 10-year-old account with stale activity. This is intentional — we score current signal strength, not tenure.
What this tool is not
- Not a hiring decision tool. The score is a signal, not a verdict. Use it as a starting point for evaluation, not a filter.
- Not a real-time dashboard. Results are cached for 6 hours. Do not expect second-by-second accuracy.
- Not an audit tool for private code. Everything is based on public GitHub data.
- Not a replacement for code review. This tool cannot assess the actual quality of code — only the signals around it.