Initial Approach with Fine-Tuning
Attempted fine-tuning using internal nursing knowledge documents to create a specialized model. The process involved:
- Segmenting Word documents into logical chunks
- Generating Q&A pairs using text-davinci-003
- Training a custom model with OpenAI's fine-tuning API
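The generated Q&A pairs then have to be serialized into the prompt-completion JSONL format that OpenAI's legacy fine-tuning API expects. A minimal sketch — the `to_finetune_jsonl` helper and the sample pair are illustrative, and the separator/stop-sequence conventions follow OpenAI's old fine-tuning guidance:

```python
import json

def to_finetune_jsonl(qa_pairs, out_path):
    """Write (question, answer) pairs as prompt-completion JSONL
    for the legacy OpenAI fine-tuning API (hypothetical helper)."""
    with open(out_path, "w", encoding="utf-8") as f:
        for q, a in qa_pairs:
            record = {
                # fixed separator marks the end of the prompt
                "prompt": q.strip() + "\n\n###\n\n",
                # leading space plus an explicit stop sequence
                "completion": " " + a.strip() + " END",
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

pairs = [("Sample question?", "Sample answer.")]
to_finetune_jsonl(pairs, "train.jsonl")
```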
Document Segmentation Code
import docx
import pandas as pd

def extract_document_sections(file_path):
    """Split a Word document into sections, one per 'Heading 1' block."""
    document = docx.Document(file_path)
    sections = []
    current_section = {"heading": "", "text": ""}
    for para in document.paragraphs:
        if para.style.name == 'Heading 1':
            # A new heading closes out the previous section
            if current_section["text"]:
                sections.append(current_section.copy())
            current_section = {"heading": para.text, "text": ""}
        else:
            current_section["text"] += para.text + "\n"
    if current_section["text"]:  # don't drop the final section
        sections.append(current_section)
    return sections

if __name__ == '__main__':
    sections = extract_document_sections('nursing_guide.docx')
    pd.DataFrame(sections).to_csv('document_sections.csv', index=False)
Q&A Generation Challenges
Automated question-answer generation produced inconsistent quality. The fine-tuned model performed poorly due to:
- Insufficient training samples
- Suboptimal prompt-completion formatting
- Limited domain-specific context understanding
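For reference, the generation step used the legacy completions endpoint with text-davinci-003. A minimal sketch — the prompt template, parameter values, and function names are assumptions, not the original pipeline:

```python
import os

def build_qa_prompt(section_text, n_pairs=3):
    """Assemble a Q&A generation prompt for one document section
    (hypothetical template)."""
    return (
        f"Generate {n_pairs} question-answer pairs based strictly on the text below.\n"
        "Format each pair as 'Q: ...' followed by 'A: ...'.\n\n"
        f"Text:\n{section_text}"
    )

def generate_qa(section_text):
    # Imported lazily so the prompt helper works without the package installed.
    import openai
    # Legacy completions endpoint, matching the text-davinci-003 model above.
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=build_qa_prompt(section_text),
        max_tokens=512,
        temperature=0.3,
    )
    return resp["choices"][0]["text"]

# Example (requires an OpenAI key):
# print(generate_qa(sections[0]["text"]))
```

Low temperature and a rigid output format help, but as noted above, quality was still inconsistent without manual review.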
Embedding-Based Solution
Implemented an alternative approach using text embeddings:
- Chunk knowledge documents into contextual segments
- Generate embeddings for each segment
- For user queries: calculate the query embedding → find the most relevant segments → inject that context into the GPT prompt
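The retrieval step above can be sketched directly: rank stored segment embeddings by cosine similarity against the query embedding, then splice the winners into the prompt. Function names and the prompt template here are assumptions, not the production code:

```python
import numpy as np

def top_k_segments(query_vec, segment_vecs, segments, k=3):
    """Rank stored segments by cosine similarity to the query embedding."""
    q = np.asarray(query_vec, dtype=float)
    m = np.asarray(segment_vecs, dtype=float)
    # cosine similarity of every segment against the query
    sims = m @ q / (np.linalg.norm(m, axis=1) * np.linalg.norm(q) + 1e-10)
    order = np.argsort(sims)[::-1][:k]  # highest similarity first
    return [(segments[i], float(sims[i])) for i in order]

def build_prompt(question, context_segments):
    """Inject retrieved segments as grounding context for GPT."""
    context = "\n---\n".join(context_segments)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```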
Implementation with llama_index
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader, LLMPredictor, ServiceContext
from langchain import OpenAI
import os

os.environ["OPENAI_API_KEY"] = 'API_KEY'

def build_knowledge_index(docs_dir):
    # Wrap the LLM in a service context so the index actually uses it
    # (originally, llm_predictor was created but never passed anywhere).
    # This targets the pre-0.6 llama_index API used throughout this section.
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.7, max_tokens=512))
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
    documents = SimpleDirectoryReader(docs_dir).load_data()
    return GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)

def query_index(question, index):
    # "compact" packs as many retrieved chunks as fit into each LLM call
    return index.query(question, response_mode="compact").response

if __name__ == '__main__':
    knowledge_index = build_knowledge_index("nursing_docs")
    knowledge_index.save_to_disk('nursing_index.json')  # persist to avoid re-embedding on restart
    print(query_index("Postpartum belly band usage timing?", knowledge_index))
Deployment Considerations
Key deployment challenges and solutions:
- Resolved Python module dependencies (_bz2 compatibility)
- Configured network binding for web access (server_name='0.0.0.0')
- Managed token limits for context injection
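For the token-limit point, one workable sketch greedily packs retrieved segments into a fixed budget. This uses the common ~4-characters-per-token heuristic rather than a real tokenizer (tiktoken would be more accurate); the function and defaults are assumptions:

```python
def fit_context(segments, max_tokens=3000, chars_per_token=4):
    """Greedily pack relevance-sorted segments into a rough token budget.
    Approximates tokens as len(text) / chars_per_token."""
    budget = max_tokens * chars_per_token  # budget in characters
    chosen, used = [], 0
    for seg in segments:  # segments assumed pre-sorted by relevance
        if used + len(seg) > budget:
            break  # stop before overflowing the model's context window
        chosen.append(seg)
        used += len(seg)
    return chosen
```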
Performance and Optimization
Each query costs approximately $0.10. Optimization strategies include:
- Balancing context chunk size for relevance and cost
- Exploring hybrid fine-tuning/embedding approaches
- Implementing caching for frequent queries
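A caching layer for the last point can be as simple as keying answers on a normalized form of the question. This `QueryCache` is a sketch under that assumption, not the deployed code:

```python
import hashlib

class QueryCache:
    """Cache answers for repeated queries to avoid paying ~$0.10 per call.
    Keys on a case- and whitespace-normalized question (hypothetical scheme)."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def _key(self, question):
        # Lowercase and collapse whitespace so trivial variants share a key
        norm = " ".join(question.lower().split())
        return hashlib.sha256(norm.encode()).hexdigest()

    def get_or_compute(self, question, compute):
        k = self._key(question)
        if k in self._store:
            self.hits += 1
            return self._store[k]
        answer = compute(question)  # e.g. lambda q: query_index(q, index)
        self._store[k] = answer
        return answer
```

Wrapping `query_index` this way means repeated questions cost one API call instead of one per ask.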