fix(linkedin): three independent adapter bugs (profile-experience timeout, connect false-positive, thread-snapshot leak) #1913
Open
archits01 wants to merge 6 commits into
profile-experience.js and profile-projects.js passed `timeout: 10000`
to page.wait({text/selector, timeout}) assuming milliseconds. The
page.wait() implementation in base-page.js treats `options.timeout` as
seconds and multiplies by 1000, so the adapters were actually waiting
10,000,000ms (~2.8 hours) instead of 10s.
In practice the command-level 60s timeout fires first, so
profile-experience always failed with "linkedin/profile-experience
timed out after 60s". profile-projects happened to work because
"Projects" text rendered fast enough that the wait resolved before the
60s cap — but the bug was still present.
Fix: change `timeout: 10000` → `timeout: 10` (4 occurrences across the
two files), bringing them in line with how search.js uses the same API
(`timeout: 8`, `timeout: 10`).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
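The unit mismatch can be checked with a few lines of arithmetic. This is a minimal sketch, not the actual `base-page.js` code: `effectiveWaitMs` is a hypothetical stand-in that assumes only what the commit message states, namely that `options.timeout` is read as seconds and multiplied by 1000.

```javascript
// Hypothetical stand-in for the conversion described in the commit message.
// Assumption: options.timeout is interpreted as seconds and scaled to ms.
function effectiveWaitMs(options) {
  return options.timeout * 1000;
}

// The 60s command-level cap mentioned in the commit message, in ms.
const COMMAND_TIMEOUT_MS = 60 * 1000;

const beforeFix = effectiveWaitMs({ timeout: 10000 }); // 10,000,000 ms
const afterFix = effectiveWaitMs({ timeout: 10 });     // 10,000 ms

console.log(beforeFix / 3_600_000);          // wait length in hours, about 2.78
console.log(beforeFix > COMMAND_TIMEOUT_MS); // true: the command cap fires first
console.log(afterFix < COMMAND_TIMEOUT_MS);  // true: the wait can now expire on its own
```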
…nect button is present

buildProfileProbeScript scans ALL visible buttons on the profile page for "Message" text (line 143) and ALL connect anchors (line 145), then the safety logic checks alreadyConnected BEFORE connectAvailable.

On LinkedIn profiles, a stray "Message" element in the sidebar, "People also viewed" cards, or premium-tier overlays trips the alreadyConnected flag even when the main Connect button is visible. Result: `connect` reports `status: not_connectable, reason: already_connected` for profiles you've never connected with.

Repro: any profile of someone you're not connected to who has any sidebar widget rendering a "Message" button. Empirically reproduced on multiple unconnected profiles (Marie Curie, Sundar Pichai namesakes).

Fix: gate the alreadyConnected check on !connectAvailable. If the profile clearly has a Connect button, prefer that signal. The Connect button is the authoritative tell: if it's there, you can connect; if it's not AND there's a Message button, then you're connected.

Minimal-diff approach: single line change at the safety check. An alternative, cleaner fix would restructure the probe to scope the button scan to the top profile card, but this matches the existing code shape with the smallest risk surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
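The gating order described in this commit can be sketched as a small decision function. `classifyProbe` is a hypothetical helper, not the adapter's real code; `connectAvailable`, `alreadyConnected`, `not_connectable` and `already_connected` come from the commit message, while the `connectable` status and the `no_connect_button` reason are placeholders invented for illustration.

```javascript
// Hypothetical sketch of the safety check after the fix.
// Assumption: the probe result exposes two booleans with these names.
function classifyProbe({ connectAvailable, alreadyConnected }) {
  // A stray "Message" button only counts when no Connect button is present.
  if (alreadyConnected && !connectAvailable) {
    return { status: 'not_connectable', reason: 'already_connected' };
  }
  if (connectAvailable) {
    return { status: 'connectable' }; // placeholder status name
  }
  return { status: 'not_connectable', reason: 'no_connect_button' }; // placeholder reason
}

// Unconnected profile whose sidebar widget renders a "Message" button:
const stranger = classifyProbe({ connectAvailable: true, alreadyConnected: true });
console.log(stranger.status); // "connectable"
```

Before the fix, the first branch would have fired on `alreadyConnected` alone, which is the false positive reported above.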
…debar leak

The thread-snapshot adapter exposed two leaks of unrelated content via the `snapshot_json` field:

1. `bodyText: refreshedText` (return object) dumped the entire `document.body.innerText`, including the messaging sidebar's conversation list with previews of other people's threads, "Sponsored" InMails, and the navigation chrome. Anyone consuming snapshot_json got every conversation preview the user could see.
2. `headerSelectors` collected names from 9 selectors including unscoped `main h1`, `main h2`, `[data-anonymize="person-name"]`, `a[href*="/in/"]` and `.msg-conversation-card__participant-names`, all of which match sidebar conversation cards. Result: `headerNames` returned every contact in the user's recent inbox.

Empirical repro on a real thread: `headerNames` contained 10 names including the active thread participant AND 8 unrelated sidebar contacts. The internal `bodyText` was ~5KB of unrelated conversation previews + nav chrome.

Fix:

- Remove `bodyText: refreshedText,` from the return object. The internal `refreshedText` variable is still used to compute the `latestMessageText` fallback inside the IIFE; that logic is unchanged. Only the externally-exposed output drops it.
- Reduce `headerSelectors` to 3 thread-scoped selectors: `.msg-thread__link-to-profile`, `.msg-thread__link-to-profile span[aria-hidden="true"]`, and `.msg-thread .msg-entity-lockup__entity-title` (the .msg-thread prefix gates it to the active conversation only).

Recipient extraction (consumer reads headerNames[0]) continues to work because `.msg-thread__link-to-profile` is the canonical thread header link.

After the fix: `headerNames` contains only the active thread participant. The `messages` array is unchanged (its selectors were already scoped to `.msg-s-message-list__event` etc.).

No test changes: the existing thread-snapshot.test.js does not assert on bodyText or on unrelated headerNames.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
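A rough sketch of name collection restricted to the three thread-scoped selectors follows. The selector strings are taken from the commit message; `collectHeaderNames`, its de-duplication step, and the use of `textContent` are assumptions made for illustration and may differ from what the adapter does inside its IIFE.

```javascript
// The three thread-scoped selectors named in the commit message.
const THREAD_HEADER_SELECTORS = [
  '.msg-thread__link-to-profile',
  '.msg-thread__link-to-profile span[aria-hidden="true"]',
  '.msg-thread .msg-entity-lockup__entity-title',
];

// Hypothetical helper: gathers trimmed, de-duplicated names from `root`.
// `root` is anything with querySelectorAll, e.g. `document` in a browser.
function collectHeaderNames(root, selectors = THREAD_HEADER_SELECTORS) {
  const names = new Set();
  for (const selector of selectors) {
    for (const el of root.querySelectorAll(selector)) {
      const name = (el.textContent || '').trim();
      if (name) names.add(name); // skip empty nodes, collapse repeats
    }
  }
  return [...names];
}

// In a page context this would be called as: collectHeaderNames(document)
```

Because none of these selectors match `.msg-conversation-card__participant-names` or bare `main h1` / `main h2` nodes, sidebar conversation cards are not visited at all.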
The cards selector at line 26 scans for `li, div, section, article` elements whose text contains "withdraw". This matches not only the real invitation cards but also their wrapping section element, whose innerText includes the page header counter (e.g. "People (81)") at the top of the list. The fallback name-extraction at line 39 (lines.find skipping a few common keywords) then picks "People (81)" as the row's name.

Result: rank 1 of sent-invitations output is a phantom row with `name: "People (81)"` and the profile_url of whatever real invite happens to be the first descendant link.

Empirical repro: rank 1 returned `name: "People (81)", profile_url: <real-marie-curie-url>`, and rank 2 returned the real `name: "Marie Curie", profile_url: <same-url>`, which is the actual invitation card.

Fix: drop rows whose parsed name matches header-counter shapes: `/^people\s*\(\d+\)$/i` and `/^\d+\s+invitations?$/i`. Minimal diff (3 lines), keeps the rest of the parsing path intact.

A cleaner fix would scope the cards selector to a known invitation-row class instead of `li, div, section, article`, but the header rows already provide a stable signature, and this matches the existing defensive style elsewhere in the script.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
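The row filter can be illustrated with the two regular expressions quoted in the commit message. `isHeaderCounter` and `dropHeaderRows` are hypothetical names, and the `trim()` before matching is an added assumption; the patterns themselves are copied verbatim.

```javascript
// Header-counter shapes from the commit message.
const HEADER_COUNTER_PATTERNS = [
  /^people\s*\(\d+\)$/i,
  /^\d+\s+invitations?$/i,
];

// Hypothetical predicate: true when a parsed name is really a page counter.
function isHeaderCounter(name) {
  const text = String(name).trim();
  return HEADER_COUNTER_PATTERNS.some((pattern) => pattern.test(text));
}

// Hypothetical filter over parsed rows of the form { name, profile_url }.
function dropHeaderRows(rows) {
  return rows.filter((row) => !isHeaderCounter(row.name));
}

const parsed = [
  { name: 'People (81)', profile_url: '<same-url>' }, // phantom wrapper row
  { name: 'Marie Curie', profile_url: '<same-url>' }, // real invitation card
];
console.log(dropHeaderRows(parsed).map((row) => row.name)); // [ 'Marie Curie' ]
```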
…olution needed)
LinkedIn's standard people-search page (/search/results/people/) accepts
several filter knobs as URL query parameters whose values are
JSON-stringified. Adapter previously only built `?keywords=<encoded>`,
matching opencli's existing CLI surface of "keywords + limit". The
underlying LinkedIn UI exposes far more.
This commit adds 7 new flags for the filters that don't require any
ID/URN resolution against LinkedIn's typeahead — they accept free text
or a small enum and get passed straight through:
--first-name <text> -> firstName="<text>"
--last-name <text> -> lastName="<text>"
--title <text> -> title="<text>"
--school-keyword <text> -> schoolFreetext="<text>"
--network 1,2,3 -> network=["F","S","O"]
(1=1st, 2=2nd, 3=3rd+; CSV maps to F/S/O)
--profile-language en,fr -> profileLanguage=["en","fr"]
(ISO 639-1 2-letter codes, validated)
--open-to proBono,boardMember -> serviceCategory=["proBono","boardMember"]
(validated against LinkedIn's enum)
`buildSearchUrl` now takes an options object instead of a bare keywords
string. The function preserves byte-for-byte compatibility for the
zero-filter case (still emits `?keywords=<encoded>`), and accepts a
bare string as input for back-compat with the existing test (which
calls `buildSearchUrl('site reliability engineer')` directly).
Filter values get JSON-stringified then percent-encoded — matching how
LinkedIn's UI re-canonicalises the URL when you click filter chips.
Empirically verified (2026-06-10): the page accepts these param names
on its server-rendered search results route, applies the filters, and
shows the same `1st 2nd 3rd+` chips it would for a UI-driven filter.
A follow-up commit will add ID-resolution filters (current-company,
past-company, industry, location, school by name) by calling LinkedIn's
typeahead Voyager endpoint.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
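The URL construction described above can be approximated as follows. This is a sketch under stated assumptions, not the adapter's actual `buildSearchUrl`: the `www.linkedin.com` host, the option names (`firstName`, `title`, `network`), the parameter ordering, and the error on unknown network values are all illustrative. What it takes from the commit message is the bare-string back-compat path, the 1/2/3 to F/S/O mapping, and the JSON-stringify-then-percent-encode step.

```javascript
// Assumed base URL; the commit message only names the /search/results/people/ path.
const PEOPLE_SEARCH_URL = 'https://www.linkedin.com/search/results/people/';
const NETWORK_CODES = { 1: 'F', 2: 'S', 3: 'O' }; // 1st, 2nd, 3rd+

// Illustrative re-implementation; accepts a bare keywords string or an options object.
function buildSearchUrl(input) {
  const opts = typeof input === 'string' ? { keywords: input } : input;
  const params = [`keywords=${encodeURIComponent(opts.keywords)}`];

  // Filter values are JSON-stringified, then percent-encoded.
  const addFilter = (name, value) =>
    params.push(`${name}=${encodeURIComponent(JSON.stringify(value))}`);

  if (opts.firstName) addFilter('firstName', opts.firstName);
  if (opts.title) addFilter('title', opts.title);
  if (opts.network) {
    const codes = opts.network.split(',').map((degree) => {
      const code = NETWORK_CODES[degree.trim()];
      if (!code) throw new Error(`Unknown network degree: ${degree}`);
      return code;
    });
    addFilter('network', codes);
  }
  return `${PEOPLE_SEARCH_URL}?${params.join('&')}`;
}

console.log(buildSearchUrl('site reliability engineer'));
// https://www.linkedin.com/search/results/people/?keywords=site%20reliability%20engineer
console.log(buildSearchUrl({ keywords: 'sre', network: '1,2' }));
// https://www.linkedin.com/search/results/people/?keywords=sre&network=%5B%22F%22%2C%22S%22%5D
```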
…r yet)

Adds direct ID-acceptance for LinkedIn's structured filters that normally require a typeahead lookup to resolve a name to a URN/ID:

--current-company-id <id1,id2,...> -> currentCompany=["<id>",...]
--past-company-id <id1,id2,...> -> pastCompany=["<id>",...]
--industry-id <id1,id2,...> -> industry=["<id>",...]
--location-id <id1,id2,...> -> geoUrn=["<id>",...]
--school-id <id1,id2,...> -> schoolFilter=["<id>",...]

Empirically verified (2026-06-10) that LinkedIn's /search/results/people/ route accepts these param values as JSON-stringified arrays of bare numeric IDs; the full `urn:li:fs_<type>:<id>` prefix is not required. Confirmed by:

?currentCompany=["1441"] -> filtered to Google employees
?geoUrn=["105214831"] -> filtered to Bengaluru-located profiles

Each flag accepts either a bare numeric ID or a full URN; URNs are stripped to their numeric tail before sending.

This is a "v2-light" implementation. A future commit could add a typeahead-based name resolver (`--current-company "Google"` -> resolve to "1441"), but LinkedIn's classic Voyager typeahead endpoint (`/voyager/api/typeahead/hits`) was removed/moved in 2024 to a GraphQL endpoint with rotating queryId hashes. Implementing that cleanly is a separate, larger lift; the ID-direct flags ship now so the filter surface is usable today.

To find an ID: visit the relevant page in the LinkedIn UI (e.g. a company's `/company/<slug>/` page), open the network panel, and look for the `fsd_company` URN in the response; or apply the filter once via the search results UI and read the resulting URL's query string. The flag --help text documents this lookup pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
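The "bare ID or full URN" handling might look roughly like the sketch below. `toNumericId` and `parseIdList` are hypothetical names, and both the accepted URN shape (`urn:li:<type>:<digits>`) and the decision to throw on anything else are assumptions; the commit message only states that URNs are stripped to their numeric tail.

```javascript
// Hypothetical normalizer: accepts "1441" or "urn:li:fsd_company:1441".
function toNumericId(value) {
  const match = /^(?:urn:li:[A-Za-z_]+:)?(\d+)$/.exec(String(value).trim());
  if (!match) {
    throw new Error(`Expected a numeric ID or URN, got: ${value}`);
  }
  return match[1]; // numeric tail, kept as a string
}

// Hypothetical CSV parser for flags such as --current-company-id.
function parseIdList(csv) {
  return csv.split(',').map(toNumericId);
}

const ids = parseIdList('1441, urn:li:fsd_company:1441');
console.log(JSON.stringify(ids)); // ["1441","1441"]
```

The resulting array is what would be JSON-stringified into a parameter such as `currentCompany=["1441"]`.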
Summary
Three independent bugs in `clis/linkedin/adapters`, found while building a downstream LinkedIn skill on top of OpenCLI v1.8.3. Each commit is self-contained and can be cherry-picked individually.

1. `profile-experience` / `profile-projects`: `page.wait` timeout unit mismatch

Both adapters pass `timeout: 10000` to `page.wait({text/selector, timeout})`, assuming milliseconds. The implementation in `base-page.js` treats `options.timeout` as seconds and multiplies by 1000, so the wait was actually 10,000 s (10,000,000 ms, ~2.8 hours). The 60 s command-level timeout fires first, so `profile-experience` always errors out with `linkedin/profile-experience timed out after 60s`. `profile-projects` happens to work because the "Projects" text loads fast enough, but the bug is the same.

2. `connect`: false-positive `already_connected` on unconnected profiles

`buildProfileProbeScript` scans every visible button on the page for the label "Message". The safety logic then checks `alreadyConnected` before `connectAvailable`. Sidebar widgets ("People also viewed"), premium-tier prompts, and overlay UI render "Message" elements on profiles you have no connection with, so `connect` reports `status: not_connectable, reason: already_connected` for strangers whose profiles clearly display a Connect button.
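3. `thread-snapshot`: sidebar conversation leak in `snapshot_json`

The adapter exposes two leaks of unrelated content via the `snapshot_json` output field:

1. `bodyText` dumped the entire `document.body.innerText`, including the messaging sidebar's conversation previews, "Sponsored" InMails, and navigation chrome.
2. `headerSelectors` included unscoped selectors (`main h1`, `main h2`, `[data-anonymize="person-name"]`, `a[href*="/in/"]`, `.msg-conversation-card__participant-names`) that match sidebar conversation cards, so `headerNames` returned every contact in the user's recent inbox.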
Empirical repro on a real thread: `headerNames` returned 10 names — the active thread participant plus 8 unrelated sidebar contacts. `bodyText` was ~5 KB of unrelated conversation previews.
Fix:
Test plan
All three fixes are independent. Happy to split into separate PRs if preferred, or to land any subset.
🤖 Generated with Claude Code