feat: major update with AI search, deep research, file uploads, chat, and model management#676

Open
SteveLauC wants to merge 265 commits into main from rebased-chore-ai-search

@SteveLauC SteveLauC commented May 11, 2026

This is a large commit containing many significant changes:

  • Add AI search with semantic search and hybrid search support
  • Add deep research assistant type for in-depth research tasks
  • Implement file upload support for assistants
  • Add default model configuration so assistants can fall back to system defaults without explicit model assignment
  • Update model provider templates and builtin model lists with model type attribute (language, vision, and embedding)
  • Implement chat functionality so that users can talk to assistants directly in Core Server

Standards checklist

  • The PR title is descriptive
  • The commit messages are semantic
  • Necessary tests are added
  • Updated the release notes
  • Necessary documents have been added if this is a new feature
  • Performance tests checked, no obvious performance degradation

@SteveLauC SteveLauC force-pushed the rebased-chore-ai-search branch 2 times, most recently from 27d9a24 to b4a07c5 Compare May 11, 2026 02:08
SteveLauC and others added 28 commits May 26, 2026 11:00
Update the following APIs:

1. POST /settings
2. POST /setup/_initialize

Also update the definitions of data sources and assistants.

This is currently a mock implementation; the newly added fields are not
wired up yet.
…n settings

- Add LLMType (string enum) to core/llm_provider.go with constants
  LLMTypeLanguage, LLMTypeVision, LLMTypeEmbedding; apply to ModelConfig.Type
- Add modules/llm/resolve.go: ResolveModel(t, override) reads AppConfig()
  via modules/common and falls back to DefaultModel when override is unset
- Remove mandatory model_provider/model panics from New() in summary,
  extract_tags, embedding, and file_extraction processors
- Relax model_context_length validation to only trigger when explicitly set
- Resolve model at every Process() call so settings changes take effect
  immediately without pipeline restart
- Keep embedding_dimension validation (runtime setting, must be in coco.yml)
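
The override-then-default lookup described in this commit can be sketched as follows. This is a minimal illustration, not the code in modules/llm/resolve.go: `ModelRef`, `DefaultModel` as a map, and `ErrNoModel` are hypothetical names, and the defaults are passed in explicitly here, whereas the commit says the real helper reads them from AppConfity-style application config via `AppConfig()`.

```go
package main

import "errors"

// LLMType mirrors the string enum named in the commit message.
type LLMType string

const (
	LLMTypeLanguage  LLMType = "language"
	LLMTypeVision    LLMType = "vision"
	LLMTypeEmbedding LLMType = "embedding"
)

// ModelRef is a hypothetical provider/model pair.
type ModelRef struct {
	ProviderID string
	Name       string
}

// IsSet reports whether both halves of the reference are filled in.
func (m ModelRef) IsSet() bool { return m.ProviderID != "" && m.Name != "" }

// DefaultModel is a hypothetical stand-in for the per-type defaults in settings.
type DefaultModel map[LLMType]ModelRef

// ErrNoModel is returned when neither the override nor a default is set.
var ErrNoModel = errors.New("no model configured")

// ResolveModel returns the override when set, otherwise the default for t.
// Calling it on every Process() is what lets settings changes apply at once.
func ResolveModel(defaults DefaultModel, t LLMType, override ModelRef) (ModelRef, error) {
	if override.IsSet() {
		return override, nil
	}
	if def, ok := defaults[t]; ok && def.IsSet() {
		return def, nil
	}
	return ModelRef{}, ErrNoModel
}
```

Because nothing is cached between calls, a processor that resolves its model this way sees a changed default on its next document.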
…del uses

Introduces AssistantModelUse enum (Answering, IntentAnalysis, PickingDoc,
PickingTool) in core/assistant.go and a ResolveAssistantModel helper in
modules/llm/resolve.go.

Each of the four call sites now resolves its model through the helper before
invoking the LLM. The resolution chain is:

  assistant config override
    -> settings.DefaultModel.<use-specific field>
    -> settings.DefaultModel.LanguageModel

Resolved provider/name are written back onto the assistant's ModelConfig so
that downstream readers (PromptConfig, Settings, provider lookup) keep working
without modification.

Note: PickingTool has one extra fallback layer beyond the three-tier chain
above — if nothing resolves, it falls back to the already-resolved
AnsweringModel. This preserves the previous behavior in call_mcp_tools.go
where an unspecified MCP model implicitly reused the answering model, so
existing assistants without a PickingToolModel or LanguageModel default in
settings continue to work instead of failing the MCP tool call.
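
A rough sketch of the resolution chain, including the extra PickingTool fallback, is shown below. The struct shapes (`Assistant.Overrides`, `Defaults.PerUse`) are assumptions made for illustration; only the enum names and the order of the fallbacks come from the commit message.

```go
package main

// AssistantModelUse mirrors the four uses named in the commit message.
type AssistantModelUse int

const (
	Answering AssistantModelUse = iota
	IntentAnalysis
	PickingDoc
	PickingTool
)

// Model is a hypothetical provider/name pair.
type Model struct{ ProviderID, Name string }

func (m Model) set() bool { return m.ProviderID != "" && m.Name != "" }

// Defaults is an assumed shape for settings.DefaultModel.
type Defaults struct {
	PerUse        map[AssistantModelUse]Model // use-specific default fields
	LanguageModel Model                       // generic language default
}

// Assistant is an assumed shape holding per-use overrides.
type Assistant struct {
	Overrides map[AssistantModelUse]Model
}

// firstSet returns the first candidate that is fully specified.
func firstSet(candidates ...Model) (Model, bool) {
	for _, c := range candidates {
		if c.set() {
			return c, true
		}
	}
	return Model{}, false
}

// ResolveAssistantModel walks: assistant override, use-specific default,
// generic language default. PickingTool alone then falls back to whatever
// Answering resolves to, which preserves the older implicit reuse.
func ResolveAssistantModel(a Assistant, d Defaults, use AssistantModelUse) (Model, bool) {
	if m, ok := firstSet(a.Overrides[use], d.PerUse[use], d.LanguageModel); ok {
		return m, true
	}
	if use == PickingTool {
		return ResolveAssistantModel(a, d, Answering)
	}
	return Model{}, false
}
```

Reading from a nil map is safe in Go, so an assistant or settings object with no overrides simply falls through to the next tier.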
Adds api.AllowOPTIONSS() to GET /attachment/:file_id and
GET /document/:doc_id/raw_content/:hint so the framework auto-registers
the matching OPTIONS route. FeatureCORS alone only wires the CORS filter
to add response headers; without the OPTIONS route the browser preflight
hits no handler and the request is blocked by CORS policy.
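
The failure mode can be reproduced with the standard library alone. The sketch below does not use the project's framework: `router`, `withCORS`, `newRouter`, and `preflight` are hypothetical helpers, and the toy router matches exact paths only. It shows that a header-adding CORS wrapper never runs for a preflight unless an OPTIONS route exists.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// router is a toy method-aware router keyed by "METHOD path" (exact match).
type router map[string]http.Handler

func (rt router) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if h, ok := rt[r.Method+" "+r.URL.Path]; ok {
		h.ServeHTTP(w, r)
		return
	}
	http.NotFound(w, r) // no route for this method: the preflight ends here
}

// withCORS only decorates responses of handlers that are actually reached.
func withCORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Access-Control-Allow-Origin", "*")
		w.Header().Set("Access-Control-Allow-Methods", "GET, OPTIONS")
		next.ServeHTTP(w, r)
	})
}

// newRouter registers the GET route and, optionally, the matching OPTIONS route.
func newRouter(allowOptions bool) router {
	const path = "/attachment/abc"
	rt := router{
		"GET " + path: withCORS(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Write([]byte("file bytes"))
		})),
	}
	if allowOptions {
		rt["OPTIONS "+path] = withCORS(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.WriteHeader(http.StatusNoContent)
		}))
	}
	return rt
}

// preflight sends an OPTIONS request and reports the status and CORS header.
func preflight(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodOptions, path, nil))
	return rec.Code, rec.Header().Get("Access-Control-Allow-Origin")
}
```

With `newRouter(false)` the preflight gets a 404 with no CORS header, which a browser treats as a blocked request; with `newRouter(true)` it gets a 204 carrying the header.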
Replaces the single POST /setup/_initialize with two public endpoints:

  - POST /setup/_initialize — creates the admin user, populates ES
    templates (model providers, assistants, MCP servers, roles, etc.),
    forces an ES refresh on <prefix>*, then marks setup as done. The
    done flag is written last so partial failures leave the wizard
    re-runnable.
  - POST /setup/_initialize/default_model — sets the default
    language/vision/embedding models (creating providers on the fly
    when needed) and refreshes the model-provider index. Does not
    affect setup completion state and may be called repeatedly.

The UI presents initialization as two steps but the backend only
tracks completion of the first; default-model setup is an independent
operation.

Persists a single KV flag (.setup_done) replacing the prior
.setup_lock. The 'setup' AppSetting now exposes { required: bool } so
the frontend can decide whether to show the wizard.
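
The "done flag is written last" ordering might look like the sketch below. `KV`, `runInitialize`, `setupRequired`, and `errSimulated` are hypothetical; the real steps (admin user, ES templates, refresh) are represented by opaque functions.

```go
package main

import "errors"

// setupDoneKey is the flag name given in the commit message.
const setupDoneKey = ".setup_done"

// KV is a hypothetical stand-in for the key-value store.
type KV map[string]bool

// errSimulated lets a caller simulate a step failing part-way through.
var errSimulated = errors.New("simulated step failure")

// runInitialize runs every step in order and writes the done flag only
// after all of them succeed, so a partial failure leaves setup re-runnable.
func runInitialize(kv KV, steps ...func() error) error {
	for _, step := range steps {
		if err := step(); err != nil {
			return err // flag not written: the wizard can be run again
		}
	}
	kv[setupDoneKey] = true
	return nil
}

// setupRequired is what the frontend-facing setting would report.
func setupRequired(kv KV) bool { return !kv[setupDoneKey] }
```

A default-model call would not touch the flag at all, which matches the statement that it does not affect setup completion state.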
…ions

When the wizard sends multiple selections that reuse the same builtin
model provider (matched by ModelProvider.ID) but with different
api_token values, the second update would silently overwrite the
first. Validate token consistency up front in setupInitializeDefaultModel
and return HTTP 400 so the caller can correct the input.

- Add SetupModelDef{id, support_reasoning} to SetupModelConfig.
  Users now specify either model_id (existing model on provider) or
  model (new model definition), making the two cases explicit.

- core.ModelConfig gains SupportReasoning bool / support_reasoning to
  indicate whether a language model supports reasoning mode. Distinct
  from ModelSettings.Reasoning which controls runtime use of reasoning.

- ensureSetupModelProvider now receives the LLM type (language/vision/
  embedding) so every model written to ES carries the correct Type and
  SupportReasoning value. For existing providers the model list is
  upserted in place; for custom providers it is included on create.

- Merge validateProviderTokenConsistency + validateModelSelection into
  a single validateDefaultModelConfig (5 rules, one loop):
    1. Exactly one of model_id / model per selection.
    2. model.id required when model is provided.
    3. support_reasoning rejected on non-language selections.
    4. Custom provider must use model, not model_id.
    5. Same builtin provider across selections requires identical token.
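
The five rules can be sketched as a single loop. The field layout of `SetupModelConfig` below (notably `Builtin`, `ProviderID`, and `APIToken`) and the sentinel errors are assumptions for illustration; only the rules themselves come from the commit message.

```go
package main

import "errors"

type LLMType string

const (
	LLMTypeLanguage  LLMType = "language"
	LLMTypeVision    LLMType = "vision"
	LLMTypeEmbedding LLMType = "embedding"
)

// SetupModelDef describes a new model to create on the provider.
type SetupModelDef struct {
	ID               string
	SupportReasoning bool
}

// SetupModelConfig is an assumed shape for one default-model selection.
type SetupModelConfig struct {
	ProviderID string
	Builtin    bool // hypothetical: true for builtin providers
	APIToken   string
	ModelID    string         // existing model on the provider
	Model      *SetupModelDef // new model definition
}

var (
	ErrModelChoice    = errors.New("exactly one of model_id or model is required")
	ErrModelIDMissing = errors.New("model.id is required when model is provided")
	ErrReasoning      = errors.New("support_reasoning is only valid for language models")
	ErrCustomModelID  = errors.New("custom provider must use model, not model_id")
	ErrTokenMismatch  = errors.New("same builtin provider requires identical api_token")
)

// validateDefaultModelConfig applies rules 1-5 in one pass over the selections.
func validateDefaultModelConfig(selections map[LLMType]SetupModelConfig) error {
	tokens := map[string]string{} // builtin provider ID -> first token seen
	for t, sel := range selections {
		hasID, hasDef := sel.ModelID != "", sel.Model != nil
		switch {
		case hasID == hasDef: // rule 1: neither or both
			return ErrModelChoice
		case hasDef && sel.Model.ID == "": // rule 2
			return ErrModelIDMissing
		case hasDef && sel.Model.SupportReasoning && t != LLMTypeLanguage: // rule 3
			return ErrReasoning
		case !sel.Builtin && hasID: // rule 4
			return ErrCustomModelID
		}
		if sel.Builtin { // rule 5
			if prev, seen := tokens[sel.ProviderID]; seen && prev != sel.APIToken {
				return ErrTokenMismatch
			}
			tokens[sel.ProviderID] = sel.APIToken
		}
	}
	return nil
}
```

Go map iteration order is unspecified, so when a payload breaks several rules at once this sketch may report any one of them; a handler would map any non-nil error to HTTP 400.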
Split validation into two receivers on the core types so callers can
validate a single model or a whole provider:

- ModelConfig.ValidateModelConfig() rejects support_reasoning=true on
  non-language model types.
- ModelProvider.ValidateModels() loops over Models and returns the first
  error.

Replace the package-level helper in modules/llm with these methods. The
create and update handlers now call obj.ValidateModels() and return HTTP
400 on a bad payload.
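
As a sketch of the two receivers, assuming simplified struct definitions (the real core types carry more fields) and assuming that only an explicit "language" type counts as a language model:

```go
package main

import "fmt"

type LLMType string

const LLMTypeLanguage LLMType = "language"

// ModelConfig is a trimmed-down, assumed version of core.ModelConfig.
type ModelConfig struct {
	Name             string
	Type             LLMType
	SupportReasoning bool
}

// ValidateModelConfig rejects support_reasoning=true on non-language models.
func (m ModelConfig) ValidateModelConfig() error {
	if m.SupportReasoning && m.Type != LLMTypeLanguage {
		return fmt.Errorf("model %q: support_reasoning is not allowed for type %q", m.Name, m.Type)
	}
	return nil
}

// ModelProvider is a trimmed-down, assumed version of the provider type.
type ModelProvider struct {
	ID     string
	Models []ModelConfig
}

// ValidateModels validates each model and returns the first error found.
func (p ModelProvider) ValidateModels() error {
	for _, m := range p.Models {
		if err := m.ValidateModelConfig(); err != nil {
			return err
		}
	}
	return nil
}
```

Putting the check on the types lets the create and update handlers share one call, `obj.ValidateModels()`, instead of each importing a helper from modules/llm.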
Add a type-level doc comment explaining that ProviderID, Name, Type and
SupportReasoning are static identity/capability fields sourced from the
provider definition, while Settings, PromptConfig and Keepalive are
runtime invocation parameters. Group the fields with section comments
for readability.

Add type="language" to every model entry in builtin provider templates
and plugin.json files that previously relied on the empty default.
Models that already declared embedding or vision keep their existing
type. Also drop leftover settings blocks from provider definitions —
settings belong to the runtime model invocation, not to the provider
catalog. Where a model previously had settings.reasoning=true, promote
it to the top-level support_reasoning capability flag.
…ting

- Clear provider_id and name in the default assistant store plugin so
  they no longer hold unresolvable $[[SETUP_LLM_*]] literals that would
  be written verbatim into ES when the assistant is installed.
- Change the RegisterAppSetting key from the nested 'setup' object to
  the flat 'setup_required' bool, matching the original contract
  expected by GET /setting/application consumers.
…e dispatch

Add a new process_documents processor that routes each incoming document
through the enrichment pipeline configured on its datasource. If no
datasource-level pipeline is set, it falls back to the globally-configured
default pipeline from app settings. Documents with no pipeline configured
are passed through to the output queue unchanged.

The sub-pipeline is executed synchronously per document so that enrichment
processors (embedding, extract_tags, summary, etc.) run before the result
is written to the documents_processed queue. Because the sub-pipeline may
expand one document into multiple outputs (e.g. a splitter processor), the
full result slice is iterated and every enriched message is pushed
individually.

Wire up the processor in coco.yml inside the process_documents pipeline
stage that bridges indexing_documents → documents_processed.
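
The routing and fan-out behavior described above can be sketched as follows. `Document`, `Pipeline`, and `Dispatcher` are hypothetical types, and the queue is reduced to a `push` callback; this is an illustration of the dispatch rule, not the processor's actual code.

```go
package main

// Document is a minimal, hypothetical document shape.
type Document struct {
	ID           string
	DatasourceID string
	Body         string
}

// Pipeline is a hypothetical enrichment pipeline: one document in, zero or
// more documents out (a splitter may expand one input into several).
type Pipeline func(Document) ([]Document, error)

// Dispatcher holds the per-datasource pipelines and the global default.
type Dispatcher struct {
	ByDatasource map[string]Pipeline
	Default      Pipeline // nil when no default is configured
}

// pick prefers the datasource-level pipeline and falls back to the default.
func (d Dispatcher) pick(doc Document) Pipeline {
	if p, ok := d.ByDatasource[doc.DatasourceID]; ok && p != nil {
		return p
	}
	return d.Default
}

// Process runs the chosen pipeline synchronously and pushes every result.
// Documents with no pipeline configured are pushed unchanged.
func (d Dispatcher) Process(doc Document, push func(Document)) error {
	p := d.pick(doc)
	if p == nil {
		push(doc)
		return nil
	}
	results, err := p(doc)
	if err != nil {
		return err
	}
	for _, r := range results {
		push(r) // one output message per enriched document
	}
	return nil
}
```

Running the sub-pipeline inline means nothing reaches the output queue until enrichment has finished for that document.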
…to document processors

Wire DocumentProcessing.LLMGenerationLanguage from the application config
into the three LLM-invoking document processors (summary, extract_tags,
file_extraction) as a fallback when no processor-level llm_generation_lang
is configured.

Precedence: processor config > app-level document_processing.llm_generation_language
> existing en-US hard default in ValidateAndNormalizeLLMLang().
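
The precedence reduces to "first non-empty value wins". A minimal sketch, with `resolveLLMLang` as a hypothetical helper name (the commit applies the final default inside ValidateAndNormalizeLLMLang()):

```go
package main

// defaultLLMLang is the hard default named in the commit message.
const defaultLLMLang = "en-US"

// resolveLLMLang returns the processor-level language when set, then the
// app-level document_processing.llm_generation_language, then en-US.
func resolveLLMLang(processorLang, appLang string) string {
	for _, candidate := range []string{processorLang, appLang} {
		if candidate != "" {
			return candidate
		}
	}
	return defaultLLMLang
}
```

For example, `resolveLLMLang("", "zh-CN")` yields `zh-CN`, while an explicit processor setting always overrides the app-level value.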

Also adds the missing "infini.sh/coco/modules/common" import to
file_extraction/processor.go required for common.AppConfig().

Validate the incoming document_processing.llm_generation_language value
against the BCP 47 standard before merging it into the stored config.
Returns HTTP 400 with a descriptive error message when the supplied tag
cannot be parsed, preventing invalid language values from being persisted.

Add pipeline.tpl to both zh-CN and en-US setup templates so that the
pipelineconfigv2 ES index is created during POST /setup/_initialize.
Without this, searching pipeline configs before the first pipeline is
created would fail with an index_not_found_exception.

PipelineConfigV2 is defined in the framework and has no explicit
MustRegisterSchemaWithIndexName call, so its index name is derived from
the lowercased Go type name without a -v2 suffix: {prefix}pipelineconfigv2.
Field mapping is handled by the coco-search index template via dynamic mapping.
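
The naming rule stated above (prefix plus lowercased Go type name) can be illustrated with reflection. This is a guess at the mechanism for illustration only: `defaultIndexName` is a hypothetical function, the local `PipelineConfigV2` is a placeholder for the framework type, and the `coco_` prefix in the usage note is invented.

```go
package main

import (
	"reflect"
	"strings"
)

// PipelineConfigV2 is an empty placeholder for the framework type.
type PipelineConfigV2 struct{}

// defaultIndexName derives an index name from the lowercased Go type name,
// the fallback when no explicit index name has been registered.
// v must be non-nil; pointers are unwrapped to the underlying type.
func defaultIndexName(prefix string, v any) string {
	t := reflect.TypeOf(v)
	for t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return prefix + strings.ToLower(t.Name())
}
```

With a prefix of `coco_`, `defaultIndexName("coco_", PipelineConfigV2{})` gives `coco_pipelineconfigv2`, with no `-v2` suffix added.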
Add field mappings to the pipeline.tpl setup template so that searching
the pipelineconfigv2 index does not fail when the index is empty and
dynamic mapping has not yet been populated.

Key mapping decisions:
- processor: enabled=false to prevent field explosion from arbitrary
  nested processor configs, while keeping the raw JSON in _source
- Standard keyword/date/boolean/integer fields mapped explicitly
- labels: dynamic object for user-defined key-value pairs
SteveLauC and others added 30 commits June 12, 2026 16:11
The .public directory is generated inside the checkout folder (coco/)
by vite's outDir: '../.public' relative to web/. The cache path was
pointing to the workspace root instead, so cache restore placed the
files in the wrong location and subsequent test compilation could not
find the infini.sh/coco/.public package.

The unit_test job does not need a real frontend build; it only needs
the .public Go package to exist so that main.go compiles. Instead of
installing nodejs/pnpm and running a full vite build, generate a
minimal .public directory with a dummy file and run make update-vfs to
produce a valid static.go package stub.

pnpm 10.x introduced a breaking change requiring packageManager
dependencies in the lockfile to carry a registry path and integrity
hash. Since the lockfile was generated with pnpm 9.12.3, CI must use
the same version. Replace all 'latest' and unversioned pnpm installs
with the pinned 9.12.3 to match the packageManager field in
package.json.

Add missing documentation for the file_type_detection processor. Remove
internal Go type references (core.Document, []queue.Message, struct
field paths) from all processor docs so the documentation is oriented
toward users configuring pipelines rather than developers reading
source code.

Reverts 1c1da75. The lang parameter on the cancel API and the
CancelLang field on MessageTask coupled the backend to UI language
concerns that belong in the frontend. The cancel flow now simply
invokes CancelFunc without persisting a localized message.
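
A language-agnostic cancel path could look like the sketch below. `TaskRegistry` and its methods are hypothetical; only `MessageTask` and `CancelFunc` are names from the commit message, and the struct here is reduced to that single field.

```go
package main

import (
	"context"
	"sync"
)

// MessageTask is reduced here to the one field the cancel flow needs.
type MessageTask struct {
	CancelFunc context.CancelFunc
}

// TaskRegistry is a hypothetical in-memory index of running tasks.
type TaskRegistry struct {
	mu    sync.Mutex
	tasks map[string]*MessageTask
}

func NewTaskRegistry() *TaskRegistry {
	return &TaskRegistry{tasks: map[string]*MessageTask{}}
}

// Start registers a task and returns the context its work should observe.
func (r *TaskRegistry) Start(id string) context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	r.mu.Lock()
	r.tasks[id] = &MessageTask{CancelFunc: cancel}
	r.mu.Unlock()
	return ctx
}

// Cancel invokes the task's CancelFunc and nothing else: no lang parameter,
// no localized message persisted. It reports whether the task was found.
func (r *TaskRegistry) Cancel(id string) bool {
	r.mu.Lock()
	task, ok := r.tasks[id]
	delete(r.tasks, id)
	r.mu.Unlock()
	if !ok {
		return false
	}
	task.CancelFunc()
	return true
}
```

Any user-facing "cancelled" text is then the frontend's job, chosen in whatever language the UI is running in.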

3 participants