
Domain Specialization (LoRA + RAG)

Provides a three-layer strategy for adapting general-purpose LLMs to specific domains such as finance, telecommunications, and manufacturing, dramatically improving code generation quality.

Core Question

"Why doesn't code generated by Claude or GPT follow our company standards?" → Because the model hasn't learned your domain knowledge.


3-Layer Strategy

Domain specialization is applied progressively: Steering → RAG → LoRA.

Layer 1: Steering (Immediate)

Definition: Explicitly define coding rules in spec files to instruct the LLM.

Pros:

  • Immediately applicable
  • Zero cost
  • Easy maintenance (just edit spec files)

Cons:

  • Limited for complex domain logic
  • Context window waste

Example:

# coding-standards.md

## Coding Conventions
- Class names: PascalCase
- Method names: camelCase
- Constants: UPPER_SNAKE_CASE

## Transaction Handling
- All DB operations must use @Transactional
- Rollback condition: on RuntimeException

## Logging Standards
- Entry point: log.info("Method {} started", methodName)
- Exceptions: log.error("Error in {}: {}", methodName, e.getMessage())

Layer 2: RAG (1-2 weeks)

Definition: Embed internal documents in a vector DB, retrieve the relevant passages at query time, and inject them into the prompt.

Pros:

  • Auto-reflects latest documents (no retraining)
  • High accuracy for internal API specs
  • No model weight changes

Cons:

  • Infrastructure required (Milvus, Neo4j)
  • Retrieval quality directly impacts output quality
  • Embedding costs

Example:

from langchain.vectorstores import Milvus
from langchain.embeddings import OpenAIEmbeddings

# 1. Embed internal API documentation
embeddings = OpenAIEmbeddings()
vectorstore = Milvus.from_documents(
    documents=internal_api_docs,
    embedding=embeddings,
    connection_args={"host": "milvus.cluster.local", "port": 19530},
)

# 2. Search for relevant documents
query = "How to call user authentication API?"
docs = vectorstore.similarity_search(query, k=3)

# 3. Pass search results + query to the LLM
context = "\n\n".join(d.page_content for d in docs)
prompt = f"Context: {context}\n\nQuestion: {query}"

Layer 3: LoRA (1-2 months)

Definition: Adjust model weights with domain data to generate domain expert-level output.

Pros:

  • Consistent code style
  • Highest domain terminology accuracy
  • Learns complex patterns

Cons:

  • GPU training cost ($2,000)
  • Training data collection required
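
At its core, LoRA freezes the base weight matrix W and trains only a low-rank update BA scaled by alpha/r. A minimal NumPy sketch (illustrative dimensions; real training would use a framework such as PEFT, NeMo, or Unsloth) shows why the trainable parameter count collapses:

```python
import numpy as np

d_out, d_in, r = 1024, 1024, 8   # hypothetical layer size and LoRA rank
alpha = 16                       # LoRA scaling hyperparameter

W = np.random.randn(d_out, d_in)       # frozen base weight
A = np.random.randn(r, d_in) * 0.01    # trainable low-rank factor
B = np.zeros((d_out, r))               # zero-initialized: output unchanged at start

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass with the low-rank update: (W + (alpha/r) * B @ A) @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = W.size
lora_params = A.size + B.size
print(f"trainable: {lora_params:,} vs full fine-tune: {full_params:,}")
# For this 1024x1024 layer, LoRA trains ~1.6% of the weights.
```

Only A and B receive gradients; the base model stays intact, which is also what makes per-customer adapter swapping cheap.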

Kiro GLM-5 vs Self-Hosting

Kiro IDE has supported GLM-5 natively since April 2026, so it is available immediately. However, LoRA fine-tuning, Multi-LoRA hot-swapping across multiple customers, and self-controlled compliance are possible only with self-hosting. Recommendation: use Kiro for prototyping, self-hosting for production domain specialization.

For the detailed LoRA training and deployment pipeline implementation, see Custom Model Pipeline — LoRA Training & Deployment Pipeline (Domain Specialization). That guide covers QLoRA GPU optimization, training data formats, the NeMo/Unsloth frameworks, checkpoint management, and Multi-LoRA hot-swap deployment configuration.


Required Layers by Scenario

| Requirement | Layer 1 (Steering) | Layer 2 (RAG) | Layer 3 (LoRA) | Recommended Combination |
|---|---|---|---|---|
| Coding conventions | ✅ Sufficient | △ Excessive | ❌ Unnecessary | Layer 1 |
| Internal API usage | △ Insufficient | ✅ Required | ❌ Unnecessary | Layer 1 + 2 |
| Domain terminology | ❌ Limited | △ Supplementary | ✅ Required | Layer 2 + 3 |
| SOC2 procedures | ✅ Playbook sufficient | ❌ Unnecessary | ❌ Unnecessary | Layer 1 |
| Consistent code style | △ Basic only | △ Supplementary | ✅ Most effective | Layer 1 + 3 |
| Legacy migration patterns | ❌ Impossible | △ Example provision | ✅ Core | Layer 2 + 3 |

Cost vs Effect
  • Layer 1 only: Free, 60% improvement
  • Layer 1 + 2: Infrastructure cost, 80% improvement
  • Layer 1 + 2 + 3: $2,000, 95% improvement

VectorRAG Configuration

VectorRAG is a document retrieval-based domain specialization approach.

Architecture

Knowledge Feature Store Integration

Integrates with Layer 5: Knowledge Feature Store of the LG U+ Agentic AI Platform for vector search.

apiVersion: feast.dev/v1alpha1
kind: FeatureStore
metadata:
  name: knowledge-feature-store
spec:
  online_store:
    type: milvus
    connection:
      host: milvus.cluster.local
      port: 19530
  entities:
    - name: api_doc
      value_type: STRING
  features:
    - name: api_embedding
      dtype: FLOAT_LIST
      dimensions: 1536  # 1536 for OpenAI ada-002; must match the embedding model in use

Data Flow

  1. Document Collection: Confluence, GitHub, Wiki → Crawling
  2. Chunk Splitting: Split into 512-token chunks (50-token overlap)
  3. Embedding: OpenAI text-embedding-3-large or BGE-M3
  4. Vector Storage: Store in Milvus collection
  5. Search: Question embedding → Cosine similarity Top-K
  6. LLM Delivery: Search results + question → LLM

Chunk Size Optimization
  • Too small: Context loss
  • Too large: Noise increase
  • Recommended: 512 tokens, 50-token overlap
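
Step 2 of the data flow above can be sketched in pure Python. Token boundaries are assumed to come from the embedding model's tokenizer; here they are simply list elements:

```python
def chunk_tokens(tokens: list[str], size: int = 512, overlap: int = 50) -> list[list[str]]:
    """Sliding-window chunking: each chunk shares `overlap` tokens with the previous one."""
    if size <= overlap:
        raise ValueError("chunk size must exceed overlap")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
    return chunks

tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # → [512, 512, 276]
```

The 50-token overlap ensures a sentence split at a chunk boundary still appears whole in at least one chunk.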

GraphRAG Configuration

GraphRAG is a knowledge graph-based domain specialization approach. It explicitly models relationships between financial business terminology and regulations.

Architecture

Ontology-Based Structure

Defines entities, relations, and attributes in the financial domain.

// Entity definitions
CREATE (loan:Product {name: "Mortgage Loan", type: "Loan"})
CREATE (credit:Criteria {name: "Credit Score", threshold: 600})
CREATE (reg:Regulation {code: "Banking Supervision Regulation Article 35"})

// Relationship definitions
CREATE (loan)-[:REQUIRES]->(credit)
CREATE (loan)-[:GOVERNED_BY]->(reg)
CREATE (credit)-[:VERIFIED_BY]->(cbService:Service {name: "CB Inquiry"})

VectorRAG + GraphRAG Hybrid

Advantages:

  • VectorRAG: Reflects latest documents
  • GraphRAG: Complex rule reasoning
  • Hybrid: Accuracy + Flexibility

Production Example

Question: "Can a customer with credit score 550 get a mortgage loan?"

  1. VectorRAG: Search "mortgage loan" documents → "Credit score 600+ required"
  2. GraphRAG: Traverse (loan)-[:REQUIRES]->(credit {threshold: 600})
  3. LLM Judgment: "550 < 600 → Not eligible" + "Credit score improvement guidance"
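
The three steps can be mocked end-to-end with in-memory stand-ins for Milvus and Neo4j (all names and data are hypothetical; a real deployment queries the actual stores and lets the LLM verbalize the judgment):

```python
# Hypothetical stand-in for the vector DB: query text -> policy document
VECTOR_DOCS = {
    "mortgage loan": "Mortgage loans require a credit score of 600 or higher.",
}

# Hypothetical stand-in for the graph: (entity, relation) -> target node
GRAPH = {
    ("Mortgage Loan", "REQUIRES"): {"entity": "Credit Score", "threshold": 600},
}

def check_eligibility(product: str, credit_score: int) -> dict:
    # 1. VectorRAG: retrieve the policy document as context
    context = VECTOR_DOCS.get(product.lower(), "")
    # 2. GraphRAG: traverse (product)-[:REQUIRES]->(criteria) for the hard rule
    rule = GRAPH[(product, "REQUIRES")]
    # 3. Deterministic judgment the LLM would explain to the user
    return {
        "eligible": credit_score >= rule["threshold"],
        "threshold": rule["threshold"],
        "context": context,
    }

print(check_eligibility("Mortgage Loan", 550)["eligible"])  # → False: 550 < 600
```

The graph supplies the hard threshold for exact reasoning, while the retrieved document gives the LLM wording for the improvement guidance.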

FSI SI Production Scenarios

Scenario 1: COBOL → Java Legacy Migration

Effect Comparison by Layer

| Approach | Accuracy | Consistency | Cost | Notes |
|---|---|---|---|---|
| Steering only | 60% | Low | Free | Syntax correct but financial logic errors |
| + RAG | 80% | Medium | Infrastructure | Improved accuracy, inconsistent patterns |
| + LoRA | 95% | High | $2,000 | Consistent patterns + financial logic |

ROI Analysis

Assumptions:

  • 10,000 modules to migrate
  • Developer hourly rate: $50/hr

| Method | Time/Module | Total Time | Total Cost | Notes |
|---|---|---|---|---|
| Manual | 2 hours | 20,000 hrs | $1,000,000 | - |
| LLM (Steering + RAG) | 1 hour | 10,000 hrs | $500,000 | Savings: $500,000 |
| LLM (+ LoRA) | 30 min | 5,000 hrs | $250,000 + $2,000 | Savings: $748,000 |

ROI:

  • LoRA training cost: $2,000
  • Cost savings: $748,000
  • ROI: 374x
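
The figures above can be checked directly; every input is one of the stated assumptions:

```python
modules = 10_000
rate = 50                 # developer hourly rate, $/hr
training_cost = 2_000     # one-time LoRA training cost

manual_cost = modules * 2.0 * rate               # 2 hrs/module, fully manual
lora_cost = modules * 0.5 * rate + training_cost  # 30 min/module + training

savings = manual_cost - lora_cost
roi = savings / training_cost
print(f"savings=${savings:,.0f}, ROI={roi:.0f}x")  # → savings=$748,000, ROI=374x
```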

Production Example

Input (COBOL):

PERFORM CALC-INTEREST
USING WS-PRINCIPAL WS-RATE
GIVING WS-INTEREST.
IF WS-CREDIT-SCORE < 600
MOVE 'REJECT' TO WS-RESULT
ELSE
MOVE 'APPROVE' TO WS-RESULT.

Output (Java, after LoRA training):

@Service
@Transactional
public class LoanService {

    @AuditLog(regulation = "Banking Supervision Regulation Article 35")
    public LoanDecision processLoan(BigDecimal principal, BigDecimal rate, int creditScore) {
        BigDecimal interest = calcInterest(principal, rate);

        if (creditScore < 600) {
            return LoanDecision.REJECT;
        }
        return LoanDecision.APPROVE;
    }

    private BigDecimal calcInterest(BigDecimal principal, BigDecimal rate) {
        return principal.multiply(rate).setScale(2, RoundingMode.HALF_UP);
    }
}

Scenario 2: Internal Framework Code Generation

In SI environments using proprietary frameworks (Samsung SDS Devon, LG CNS Anyframe, etc.), general-purpose LLMs cannot generate accurate code.

Solution

  1. LoRA learns framework patterns

    {"input": "Create user lookup API", "output": "@DevonController\npublic class UserController extends AbstractController {\n    @DevonService\n    private UserService userService;\n    ..."}
  2. RAG searches framework API documentation

    # Embed Devon API documentation
    docs = ["DevonController usage", "DevonService transaction handling", ...]
    vectorstore.add_texts(docs)
  3. Steering enforces conventions

    - All Controllers must extend AbstractController
    - Services must use @DevonService annotation

Effect

  • Internal framework code generation accuracy: 95%
  • Junior developer onboarding time: 3 months → 1 month

Scenario 3: Regulatory Compliance Code Auto-Generation

Automatically reflects financial regulations (Electronic Financial Supervisory Regulation (전자금융감독규정), Banking Supervision Regulation (은행업감독규정)) into code.

Training Data Example

{"input": "Loan approval API", "output": "@AuditLog(regulation = \"Banking Supervision Regulation Article 35\")\n@AccessControl(level = AccessLevel.CRITICAL)\npublic TransferResult executeTransfer(TransferRequest req) {\n    validateTransactionLimit(req); // Article 34\n    fdsService.checkAnomalySync(req); // FDS integration\n    ...\n}"}

Auto-Generated Result

@RestController
@RequestMapping("/api/loan")
public class LoanController {

    @AuditLog(regulation = "Banking Supervision Regulation Article 35")
    @AccessControl(level = AccessLevel.CRITICAL)
    @PostMapping("/approve")
    public LoanResponse approveLoan(@RequestBody LoanRequest req) {
        // Article 34: Transaction limit validation
        validateTransactionLimit(req);

        // FDS anomaly detection (Article 15)
        if (fdsService.detectAnomaly(req)) {
            throw new FraudException("Anomaly detected");
        }

        // Identity verification (Article 17)
        if (!authService.verifyIdentity(req.getSsn())) {
            throw new AuthException("Identity verification failed");
        }

        return loanService.approve(req);
    }
}

Regulatory Change Response

When regulations change:

  1. Update training data
  2. Retrain LoRA (2-3 days)
  3. Auto-scan existing code → Detect violations
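
Step 3 can be approximated with a lightweight static scan. This is a regex heuristic rather than a real Java parser; the method and annotation names are taken from the examples above:

```python
import re

# Heuristic: regulated entry points must carry @AuditLog in the annotation block above them.
AUDIT_RE = re.compile(r"@AuditLog\b")
METHOD_RE = re.compile(r"public\s+\w+\s+(approveLoan|executeTransfer)\s*\(")

def find_violations(source: str) -> list[str]:
    """Return regulated method names that are not preceded by @AuditLog."""
    violations = []
    for match in METHOD_RE.finditer(source):
        preceding = source[:match.start()]
        # Only inspect text after the previous closing brace, i.e. this method's annotations
        annotation_block = preceding.rsplit("}", 1)[-1]
        if not AUDIT_RE.search(annotation_block):
            violations.append(match.group(1))
    return violations

code = 'public LoanResponse approveLoan(LoanRequest req) { return null; }'
print(find_violations(code))  # → ['approveLoan'] — missing @AuditLog
```

A production scan would use a proper parser (e.g. a Java AST tool), but even a heuristic pass narrows the retrained model's review queue after a regulation change.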

Scenario 4: Multi-Customer Operations

When an SI company operates multiple customers on the same platform, per-customer LoRA adapters are hot-swapped.

Per-Customer Configuration

| Customer | Domain | Base Model | LoRA | RAG |
|---|---|---|---|---|
| Bank A | Core banking | GLM-5-32B | Bank-Core | Bank-API |
| Securities B | Order execution | GLM-5-32B | Securities-Order | Securities-API |
| Insurance C | Contract management | GLM-5-32B | Insurance-Contract | Insurance-API |
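
Per-request adapter selection can be sketched as a lookup table (identifiers are hypothetical; production serving would use an inference server with Multi-LoRA support, as detailed in the Custom Model Pipeline guide):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CustomerConfig:
    lora_adapter: str     # LoRA adapter name to hot-swap onto the shared base model
    rag_collection: str   # per-customer vector DB collection

# The per-customer table above as a routing map (hypothetical customer IDs)
ROUTES = {
    "bank-a":       CustomerConfig("Bank-Core", "Bank-API"),
    "securities-b": CustomerConfig("Securities-Order", "Securities-API"),
    "insurance-c":  CustomerConfig("Insurance-Contract", "Insurance-API"),
}

def route(customer_id: str) -> CustomerConfig:
    """Resolve adapter + RAG collection; all customers share the GLM-5-32B base."""
    try:
        return ROUTES[customer_id]
    except KeyError:
        raise ValueError(f"unknown customer: {customer_id}") from None

print(route("bank-a").lora_adapter)  # → Bank-Core
```

Because the base weights are shared, switching customers swaps only the small adapter and the retrieval collection, not the multi-gigabyte model.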

For Multi-LoRA deployment and per-customer routing implementation, see Custom Model Pipeline — LoRA Training & Deployment Pipeline.


Evaluation Pipeline

Continuously validate the quality of domain-specialized models against a fixed set of evaluation methods and baselines.


Phase-by-Phase Adoption Roadmap

| Phase | Duration | Configuration | Effect | Cost |
|---|---|---|---|---|
| 1 | Immediate | Steering + Playbook | Compliance + Basic quality | Free |
| 2 | 1-2 weeks | + VectorRAG (Milvus) | Internal knowledge accuracy improvement | Infrastructure |
| 3 | 2-4 weeks | + SLM Cascade | Cost optimization (70% savings) | +$500/month |
| 4 | 1-2 months | + LoRA Fine-tuning | Domain expertise + Style consistency | GPU $2K |

For detailed per-phase implementation guides, see the Custom Model Pipeline Guide.


References

Official Documentation