fix(linkedin): three independent adapter bugs (profile-experience timeout, connect false-positive, thread-snapshot leak) by archits01 · Pull Request #1913 · jackwener/OpenCLI

archits01 · 2026-06-10T13:03:09Z

Summary

Three independent bugs in clis/linkedin/ adapters, found while building a downstream LinkedIn skill on top of OpenCLI v1.8.3. Each commit is self-contained and can be cherry-picked individually.

1. `profile-experience` / `profile-projects` — `page.wait` timeout unit mismatch

Both adapters pass `timeout: 10000` to `page.wait({text/selector, timeout})`, assuming milliseconds. The implementation in base-page.js treats `options.timeout` as seconds and multiplies by 1000, so the wait budget was actually 10,000 s (10,000,000 ms, about 2.8 hours). The 60 s command-level timeout fires first, so `profile-experience` always errors out with `linkedin/profile-experience timed out after 60s`. `profile-projects` happens to work because the "Projects" text loads fast enough for the wait to resolve before the 60 s cap, but the bug is the same.

Repro: `opencli linkedin profile-experience --profile-url https://www.linkedin.com/in// --site-session persistent` against any valid profile → consistently times out at 60 s.
Trace shows pre_navigate completes, then the adapter hangs at the `page.wait({ text: 'Experience', timeout: 10000 })` call with no further actions.
Fix: `10000 → 10` at 4 call sites (profile-experience.js:499/503/643, profile-projects.js:288). Matches how `search.js` already calls the same API (`timeout: 8`, `timeout: 10`).
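
The arithmetic behind the mismatch can be sketched as follows. `waitBudgetMs` is a hypothetical stand-in for the seconds-to-milliseconds conversion described above, not the actual code in base-page.js; only the `timeout` values and the 60 s cap come from this PR.

```javascript
// Hypothetical stand-in for the conversion described above: the wait helper
// interprets `timeout` as seconds and converts it to milliseconds.
function waitBudgetMs(options) {
  return options.timeout * 1000;
}

// The 60 s command-level cap mentioned in the PR, expressed in milliseconds.
const COMMAND_TIMEOUT_MS = 60 * 1000;

const budgetBefore = waitBudgetMs({ text: 'Experience', timeout: 10000 });
const budgetAfter = waitBudgetMs({ text: 'Experience', timeout: 10 });

console.log(budgetBefore);                         // 10000000 (ms)
console.log((budgetBefore / 3600000).toFixed(1));  // "2.8" (hours)
console.log(budgetBefore > COMMAND_TIMEOUT_MS);    // true: the command cap fires first
console.log(budgetAfter < COMMAND_TIMEOUT_MS);     // true: the wait can expire on its own
```

With the corrected value the wait gives up after 10 s, well inside the command-level cap, so a missing element surfaces as a wait failure rather than a whole-command timeout.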

2. `connect` — false-positive `already_connected` on unconnected profiles

`buildProfileProbeScript` scans every visible button on the page for label "Message". The safety logic then checks `alreadyConnected` before `connectAvailable`. Sidebar widgets ("People also viewed"), premium-tier prompts, and overlay UI render "Message" elements on profiles you have no connection with — so `connect` reports `status: not_connectable, reason: already_connected` for strangers whose profiles clearly display a Connect button.

Repro: pick any profile of someone you are not connected to. `opencli linkedin connect '' --expected-name ''` returns `already_connected` even though the page has a visible Connect button.
Empirically reproduced on multiple unconnected profiles.
Fix: gate `alreadyConnected` on `!connectAvailable` at the safety check (connect.js:119). The explicit Connect button is the authoritative tell. This is the smallest possible diff (one line). A cleaner alternative would scope the button-label scan to the top profile card.
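
A minimal sketch of the gated check, assuming the probe result boils down to two booleans. `classifyProbe`, its return shape, and the `no_connect_button` reason are hypothetical names for illustration; the real safety check in connect.js may carry more fields and branches.

```javascript
// Hypothetical reduction of the safety check. The two flag names follow the
// PR text; everything else here is an illustrative assumption.
function classifyProbe(probe) {
  // A "Message" button only counts as "already connected" when no Connect
  // button is visible, because the Connect button is treated as authoritative.
  if (probe.alreadyConnected && !probe.connectAvailable) {
    return { connectable: false, reason: 'already_connected' };
  }
  if (probe.connectAvailable) {
    return { connectable: true };
  }
  // Neither signal present (hypothetical reason string).
  return { connectable: false, reason: 'no_connect_button' };
}

// Stranger whose page also renders a stray sidebar "Message" button.
console.log(classifyProbe({ alreadyConnected: true, connectAvailable: true }));
// Genuine first-degree connection: Message button, no Connect button.
console.log(classifyProbe({ alreadyConnected: true, connectAvailable: false }));
```

The first call is the case that previously produced the false positive; with the gate in place it is classified as connectable, while the second call still reports `already_connected`.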

3. `thread-snapshot` — sidebar conversation leak in `snapshot_json`

The adapter exposes two leaks of unrelated content via the `snapshot_json` output field:

`bodyText: refreshedText` in the return object dumps the entire `document.body.innerText` — which on the messaging page includes the sidebar conversation list (other people's thread previews), "Sponsored" InMails, and navigation chrome.
`headerSelectors` includes nine selectors, six of which are unscoped (among them `main h1`, `main h2`, `[data-anonymize="person-name"]`, `a[href*="/in/"]`, and `.msg-conversation-card__participant-names`). On the messaging page these match the sidebar's conversation cards, so `headerNames` returns every contact in the user's recent inbox.

Empirical repro on a real thread: `headerNames` returned 10 names, including the active thread participant and 8 unrelated sidebar contacts. `bodyText` was ~5 KB of unrelated conversation previews.

Fix:

Remove `bodyText: refreshedText,` from the return. The internal `refreshedText` variable is still used to compute the `latestMessageText` fallback inside the IIFE; only the externally-exposed output drops it.
Reduce `headerSelectors` to three thread-scoped selectors: `.msg-thread__link-to-profile`, `.msg-thread__link-to-profile span[aria-hidden="true"]`, and `.msg-thread .msg-entity-lockup__entity-title`. Recipient extraction still works because the consumer reads `headerNames[0]`, which the first selector still provides.
No test changes — existing `thread-snapshot.test.js` does not assert on `bodyText` or unrelated `headerNames`.
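
The scoped extraction can be sketched roughly as below. The three selectors are the ones listed in the fix; `collectHeaderNames`, the de-duplication step, and the `fakeRoot` stub with its placeholder name are assumptions made for illustration and are not taken from the adapter source.

```javascript
// The three thread-scoped selectors named in the fix.
const THREAD_HEADER_SELECTORS = [
  '.msg-thread__link-to-profile',
  '.msg-thread__link-to-profile span[aria-hidden="true"]',
  '.msg-thread .msg-entity-lockup__entity-title',
];

// Hypothetical helper: collects trimmed, de-duplicated text from every element
// matched by the selectors. `root` is anything exposing querySelectorAll,
// such as `document` inside the page.
function collectHeaderNames(root, selectors) {
  const names = [];
  for (const selector of selectors) {
    for (const el of root.querySelectorAll(selector)) {
      const text = (el.textContent || '').trim();
      if (text && !names.includes(text)) names.push(text);
    }
  }
  return names;
}

// Stub standing in for the DOM so the sketch runs outside a browser.
const fakeRoot = {
  querySelectorAll(selector) {
    const matches = {
      '.msg-thread__link-to-profile': [{ textContent: ' Ada Example ' }],
      '.msg-thread__link-to-profile span[aria-hidden="true"]': [{ textContent: 'Ada Example' }],
    };
    return matches[selector] || [];
  },
};

const headerNames = collectHeaderNames(fakeRoot, THREAD_HEADER_SELECTORS);
console.log(headerNames); // a single entry: 'Ada Example'
```

Because every selector is anchored under `.msg-thread`, sidebar conversation cards never reach the result, and a consumer reading `headerNames[0]` still gets the thread participant.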

Test plan

`profile-experience --profile-url ` returns 7 entries with title/company/dates/location/description in ~25 s (was timing out at 60 s).
`profile-projects --profile-url ` still returns the projects list (regression check).
`connect --profile-url --expected-name ` now returns `connectable: true` (was `already_connected`). With `--send true`, real invite delivered and `delivery_verified: true`.
`connect` on a profile you ARE connected to still correctly returns `already_connected` (no regression).
`thread-snapshot --thread-url ` returns `headerNames: [""]` only (was returning 10+ unrelated names). `messages` array still complete (41 messages on the test thread). `snapshot_json` no longer contains any unrelated conversation text.

All three fixes are independent. Happy to split into separate PRs if preferred, or to land any subset.

🤖 Generated with Claude Code

profile-experience.js and profile-projects.js passed `timeout: 10000` to `page.wait({text/selector, timeout})`, assuming milliseconds. The page.wait() implementation in base-page.js treats `options.timeout` as seconds and multiplies by 1000, so the adapters were actually waiting 10,000,000 ms (~2.8 hours) instead of 10 s.

In practice the command-level 60 s timeout fires first, so profile-experience always failed with "linkedin/profile-experience timed out after 60s". profile-projects happened to work because the "Projects" text rendered fast enough that the wait resolved before the 60 s cap, but the bug was still present.

Fix: change `timeout: 10000` → `timeout: 10` (4 occurrences across the two files), bringing them in line with how search.js uses the same API (`timeout: 8`, `timeout: 10`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…nect button is present

buildProfileProbeScript scans ALL visible buttons on the profile page for "Message" text (line 143) and ALL connect anchors (line 145), then the safety logic checks alreadyConnected BEFORE connectAvailable. On LinkedIn profiles, a stray "Message" element in the sidebar, "People also viewed" cards, or premium-tier overlays trips the alreadyConnected flag even when the main Connect button is visible.

Result: `connect` reports `status: not_connectable, reason: already_connected` for profiles you've never connected with.

Repro: any profile of someone you're not connected to who has any sidebar widget rendering a "Message" button. Empirically reproduced on multiple unconnected profiles (Marie Curie, Sundar Pichai namesakes).

Fix: gate the alreadyConnected check on !connectAvailable. If the profile clearly has a Connect button, prefer that signal. The Connect button is the authoritative tell: if it's there, you can connect; if it's not AND there's a Message button, then you're connected.

Minimal-diff approach: single-line change at the safety check. An alternative, cleaner fix would restructure the probe to scope the button scan to the top profile card, but this matches the existing code shape with the smallest risk surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…debar leak

The thread-snapshot adapter exposed two leaks of unrelated content via the `snapshot_json` field:

1. `bodyText: refreshedText` (return object) dumped the entire `document.body.innerText`, including the messaging sidebar's conversation list with previews of other people's threads, "Sponsored" InMails, and the navigation chrome. Anyone consuming snapshot_json got every conversation preview the user could see.
2. `headerSelectors` collected names from 9 selectors including unscoped `main h1`, `main h2`, `[data-anonymize="person-name"]`, `a[href*="/in/"]` and `.msg-conversation-card__participant-names`, all of which match sidebar conversation cards. Result: `headerNames` returned every contact in the user's recent inbox.

Empirical repro on a real thread: `headerNames` contained 10 names including the active thread participant AND 8 unrelated sidebar contacts. The internal `bodyText` was ~5 KB of unrelated conversation previews + nav chrome.

Fix:

- Remove `bodyText: refreshedText,` from the return object. The internal `refreshedText` variable is still used to compute the `latestMessageText` fallback inside the IIFE; that logic is unchanged. Only the externally-exposed output drops it.
- Reduce `headerSelectors` to 3 thread-scoped selectors: `.msg-thread__link-to-profile`, `.msg-thread__link-to-profile span[aria-hidden="true"]`, and `.msg-thread .msg-entity-lockup__entity-title` (the .msg-thread prefix gates it to the active conversation only). Recipient extraction (consumer reads headerNames[0]) continues to work because `.msg-thread__link-to-profile` is the canonical thread header link.

After the fix: `headerNames` contains only the active thread participant. The `messages` array is unchanged (its selectors were already scoped to `.msg-s-message-list__event` etc.).

No test changes: the existing thread-snapshot.test.js does not assert on bodyText or on unrelated headerNames.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The cards selector at line 26 scans for `li, div, section, article` elements whose text contains "withdraw". This matches not only the real invitation cards but also their wrapping section element, whose innerText includes the page header counter (e.g. "People (81)") at the top of the list. The fallback name-extraction at line 39 (lines.find skipping a few common keywords) then picks "People (81)" as the row's name.

Result: rank 1 of sent-invitations output is a phantom row with `name: "People (81)"` and the profile_url of whatever real invite happens to be the first descendant link.

Empirical repro: rank 1 returned `name: "People (81)", profile_url: <real-marie-curie-url>`, and rank 2 returned the real `name: "Marie Curie", profile_url: <same-url>`, which is the actual invitation card.

Fix: drop rows whose parsed name matches header-counter shapes: `/^people\s*\(\d+\)$/i` and `/^\d+\s+invitations?$/i`. Minimal diff (3 lines), keeps the rest of the parsing path intact.

A cleaner fix would scope the cards selector to a known invitation-row class instead of `li, div, section, article`, but the header rows already provide a stable signature, and this matches the existing defensive style elsewhere in the script.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
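
The header-counter filter can be sketched as follows, assuming the first pattern is meant to match counters shaped like "People (81)". `isHeaderCounter` and the placeholder profile URL are hypothetical; only the two counter shapes come from the commit message.

```javascript
// Counter shapes described in the commit message: "People (81)" and
// "81 invitations" style header rows.
const HEADER_COUNTER_PATTERNS = [
  /^people\s*\(\d+\)$/i,
  /^\d+\s+invitations?$/i,
];

// Hypothetical helper: true when a parsed name is really a list header counter.
function isHeaderCounter(name) {
  const trimmed = String(name || '').trim();
  return HEADER_COUNTER_PATTERNS.some((re) => re.test(trimmed));
}

// Rows as the parser produced them before the fix (placeholder URL).
const parsedRows = [
  { name: 'People (81)', profile_url: 'https://www.linkedin.com/in/example/' },
  { name: 'Marie Curie', profile_url: 'https://www.linkedin.com/in/example/' },
];

const invitationRows = parsedRows.filter((row) => !isHeaderCounter(row.name));
console.log(invitationRows.map((row) => row.name)); // only 'Marie Curie' remains
```

Both patterns are anchored at start and end, so a real contact whose name merely contains "people" or a number is not dropped.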

…olution needed)

LinkedIn's standard people-search page (/search/results/people/) accepts several filter knobs as URL query parameters whose values are JSON-stringified. The adapter previously only built `?keywords=<encoded>`, matching opencli's existing CLI surface of "keywords + limit". The underlying LinkedIn UI exposes far more.

This commit adds 7 new flags for the filters that don't require any ID/URN resolution against LinkedIn's typeahead. They accept free text or a small enum and get passed straight through:

- `--first-name <text>` -> `firstName="<text>"`
- `--last-name <text>` -> `lastName="<text>"`
- `--title <text>` -> `title="<text>"`
- `--school-keyword <text>` -> `schoolFreetext="<text>"`
- `--network 1,2,3` -> `network=["F","S","O"]` (1=1st, 2=2nd, 3=3rd+; CSV maps to F/S/O)
- `--profile-language en,fr` -> `profileLanguage=["en","fr"]` (ISO 639-1 2-letter codes, validated)
- `--open-to proBono,boardMember` -> `serviceCategory=["proBono","boardMember"]` (validated against LinkedIn's enum)

`buildSearchUrl` now takes an options object instead of a bare keywords string. The function preserves byte-for-byte compatibility for the zero-filter case (still emits `?keywords=<encoded>`), and accepts a bare string as input for back-compat with the existing test (which calls `buildSearchUrl('site reliability engineer')` directly).

Filter values get JSON-stringified then percent-encoded, matching how LinkedIn's UI re-canonicalises the URL when you click filter chips.

Empirically verified (2026-06-10): the page accepts these param names on its server-rendered search results route, applies the filters, and shows the same `1st 2nd 3rd+` chips it would for a UI-driven filter.

A follow-up commit will add ID-resolution filters (current-company, past-company, industry, location, school by name) by calling LinkedIn's typeahead Voyager endpoint.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
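
A rough sketch of the URL construction described above, covering only three of the seven flags. `buildSearchUrlSketch` is a hypothetical name, and the use of `encodeURIComponent`, the option keys, and the error message are assumptions; the adapter's real `buildSearchUrl` may encode or validate differently.

```javascript
// Mapping from the CLI's 1/2/3 degrees to LinkedIn's F/S/O codes, per the
// commit message.
const NETWORK_CODES = { 1: 'F', 2: 'S', 3: 'O' };

// Hypothetical sketch: accepts a bare keywords string (back-compat) or an
// options object, and JSON-stringifies then percent-encodes filter values.
function buildSearchUrlSketch(input) {
  const opts = typeof input === 'string' ? { keywords: input } : { ...input };
  const parts = [`keywords=${encodeURIComponent(opts.keywords || '')}`];

  const addFilter = (param, value) => {
    parts.push(`${param}=${encodeURIComponent(JSON.stringify(value))}`);
  };

  if (opts.firstName) addFilter('firstName', opts.firstName);
  if (opts.title) addFilter('title', opts.title);
  if (opts.network) {
    const codes = String(opts.network)
      .split(',')
      .map((degree) => NETWORK_CODES[degree.trim()]);
    if (codes.some((code) => !code)) {
      throw new Error(`invalid network value: ${opts.network}`);
    }
    addFilter('network', codes);
  }
  return `https://www.linkedin.com/search/results/people/?${parts.join('&')}`;
}

console.log(buildSearchUrlSketch('site reliability engineer'));
console.log(buildSearchUrlSketch({ keywords: 'sre', network: '1,2' }));
```

Keeping `keywords` as the first and, in the zero-filter case, only parameter is what lets the bare-string call produce the same URL shape as before.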

…r yet)

Adds direct ID-acceptance for LinkedIn's structured filters that normally require a typeahead lookup to resolve a name to a URN/ID:

- `--current-company-id <id1,id2,...>` -> `currentCompany=["<id>",...]`
- `--past-company-id <id1,id2,...>` -> `pastCompany=["<id>",...]`
- `--industry-id <id1,id2,...>` -> `industry=["<id>",...]`
- `--location-id <id1,id2,...>` -> `geoUrn=["<id>",...]`
- `--school-id <id1,id2,...>` -> `schoolFilter=["<id>",...]`

Empirically verified (2026-06-10) that LinkedIn's /search/results/people/ route accepts these param values as JSON-stringified arrays of bare numeric IDs; the full `urn:li:fs_<type>:<id>` prefix is not required. Confirmed by:

- `?currentCompany=["1441"]` -> filtered to Google employees
- `?geoUrn=["105214831"]` -> filtered to Bengaluru-located profiles

Each flag accepts either a bare numeric ID or a full URN; URNs are stripped to their numeric tail before sending.

This is a "v2-light" implementation. A future commit could add a typeahead-based name resolver (`--current-company "Google"` -> resolve to "1441"), but LinkedIn's classic Voyager typeahead endpoint (`/voyager/api/typeahead/hits`) was removed/moved in 2024 to a GraphQL endpoint with rotating queryId hashes. Implementing that cleanly is a separate, larger lift; the ID-direct flags ship now so the filter surface is usable today.

To find an ID: visit the relevant page in LinkedIn UI (e.g. a company's `/company/<slug>/` page), open the network panel, look for the `fsd_company` URN in the response; or apply the filter once via the search results UI and read the resulting URL's query string. The flag --help text documents this lookup pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
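
The "bare ID or full URN" handling might look roughly like this. `toNumericId`, `idFilterParam`, the URN regular expression, and the error text are hypothetical; the commit message only states that URNs are stripped to their numeric tail and sent as a JSON-stringified array.

```javascript
// Hypothetical helper: accepts "1441" or a URN such as
// "urn:li:fsd_company:1441" and returns the numeric tail as a string.
function toNumericId(value) {
  const raw = String(value).trim();
  const match = /^(?:urn:li:[A-Za-z_]+:)?(\d+)$/.exec(raw);
  if (!match) {
    throw new Error(`expected a numeric ID or URN, got: ${raw}`);
  }
  return match[1];
}

// Hypothetical helper: turns a CSV flag value into one encoded query parameter
// whose value is a JSON-stringified array of bare IDs.
function idFilterParam(param, csv) {
  const ids = String(csv).split(',').map((item) => toNumericId(item));
  return `${param}=${encodeURIComponent(JSON.stringify(ids))}`;
}

console.log(toNumericId('urn:li:fsd_company:1441'));   // "1441"
console.log(idFilterParam('currentCompany', '1441'));  // currentCompany=%5B%221441%22%5D
console.log(idFilterParam('geoUrn', '105214831'));
```

Rejecting anything that is neither a bare number nor a URN ending in digits keeps a mistyped company name from silently turning into an empty or malformed filter.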

archits01 and others added 6 commits June 10, 2026 18:01

archits01 wants to merge 6 commits into jackwener:main from archits01:openslide-vendor-patches
