Building a Desktop Translation Tool with Google and Baidu APIs

Architecture

Because the application supports multiple translation backends, it is worth extracting the common functionality into a separate interface. This also makes it easier to add further translation providers later. The design is a simple abstraction layer: an ITranslationProvider interface, implemented by concrete adapters such as GoogleAdapter and BaiduAdapter. The UI layer depends only on this interface, not on specific implementations, which is a straightforward application of the dependency inversion principle.

The interface definition:

using System;
using System.Collections.Generic;

/// <summary>
/// Translation provider interface - all translators must implement this contract
/// </summary>
public interface ITranslationProvider
{
    /// <summary>
    /// Translates text from the source language to the target language
    /// </summary>
    string Translate(string sourceText, string sourceLanguage, string targetLanguage);

    /// <summary>
    /// URL for text-to-speech audio playback of the most recent translation
    /// </summary>
    Uri AudioSource { get; }

    /// <summary>
    /// Collection of all supported language codes
    /// </summary>
    IReadOnlyList<string> SupportedLanguages { get; }

    /// <summary>
    /// Time taken by the most recent translation operation
    /// </summary>
    TimeSpan ExecutionTime { get; }
}

Both GoogleAdapter and BaiduAdapter implement this interface independently.
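
To make the contract concrete, here is a minimal sketch of what an implementation might look like. EchoAdapter is a hypothetical stand-in, not one of the article's adapters: it makes no network call, its language list is invented, and the tts.example.invalid host is a placeholder. It only illustrates how Translate can populate AudioSource and ExecutionTime as a side effect.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public interface ITranslationProvider
{
    string Translate(string sourceText, string sourceLanguage, string targetLanguage);
    Uri AudioSource { get; }
    IReadOnlyList<string> SupportedLanguages { get; }
    TimeSpan ExecutionTime { get; }
}

// Hypothetical stand-in adapter: performs no network call and exists only
// to show how an implementation can fill in the interface members.
public sealed class EchoAdapter : ITranslationProvider
{
    // Invented language list, for illustration only.
    private static readonly string[] Languages = { "en", "fr", "de" };

    public Uri AudioSource { get; private set; }
    public TimeSpan ExecutionTime { get; private set; }
    public IReadOnlyList<string> SupportedLanguages => Languages;

    public string Translate(string sourceText, string sourceLanguage, string targetLanguage)
    {
        if (sourceText == null) throw new ArgumentNullException(nameof(sourceText));
        if (Array.IndexOf(Languages, targetLanguage) < 0)
            throw new ArgumentException("Unsupported language: " + targetLanguage, nameof(targetLanguage));

        var stopwatch = Stopwatch.StartNew();

        // A real adapter would call the remote translation API here.
        string translated = "[" + targetLanguage + "] " + sourceText;

        // Placeholder host; a real adapter would point at the provider's TTS endpoint.
        AudioSource = new Uri("https://tts.example.invalid/speak?lang="
            + Uri.EscapeDataString(targetLanguage)
            + "&q=" + Uri.EscapeDataString(translated));

        stopwatch.Stop();
        ExecutionTime = stopwatch.Elapsed;
        return translated;
    }
}
```

Because AudioSource and ExecutionTime describe the last call, an adapter written this way is stateful and should not be shared across threads without synchronization.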

Implementation Details

The usage pattern is straightforward. Initialize the desired translation provider through the interface:

ITranslationProvider provider = new GoogleAdapter();

// Perform translation
string result = provider.Translate("I am an earthling", "en", "fr");

// Retrieve speech synthesis URL
Uri audioUrl = provider.AudioSource;

// Get execution duration in milliseconds
int elapsedMs = (int)provider.ExecutionTime.TotalMilliseconds;

// Switch to alternative provider
provider = new BaiduAdapter();

// Translate using the other backend
result = provider.Translate("I am an earthling", "en", "de");

// Retrieve speech synthesis URL
audioUrl = provider.AudioSource;

// Get execution duration in milliseconds
elapsedMs = (int)provider.ExecutionTime.TotalMilliseconds;

Each adapter class sends HTTP requests to its own API. The implementation uses HttpClient to communicate with the translation service and parses the JSON response returned by the server.
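
The request side of an adapter could be sketched roughly as follows. This is an illustration under stated assumptions, not the article's actual code: the class name TranslationHttpClient, the query parameter names (q, from, to) and the endpoint passed in by the caller are all hypothetical, and real provider APIs typically require keys or signatures that are omitted here.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public sealed class TranslationHttpClient
{
    private readonly HttpClient _http;
    private readonly Uri _endpoint;

    // The endpoint is supplied by the caller; no real provider URL is assumed.
    public TranslationHttpClient(HttpClient http, Uri endpoint)
    {
        _http = http ?? throw new ArgumentNullException(nameof(http));
        _endpoint = endpoint ?? throw new ArgumentNullException(nameof(endpoint));
    }

    // Builds the request URI. The parameter names (q, from, to) are hypothetical.
    public Uri BuildRequestUri(string text, string from, string to)
    {
        var builder = new UriBuilder(_endpoint)
        {
            Query = "q=" + Uri.EscapeDataString(text)
                  + "&from=" + Uri.EscapeDataString(from)
                  + "&to=" + Uri.EscapeDataString(to)
        };
        return builder.Uri;
    }

    // Sends the GET request and returns the raw response body for later parsing.
    public async Task<string> FetchRawJsonAsync(
        string text, string from, string to, CancellationToken ct = default)
    {
        using (HttpResponseMessage response = await _http.GetAsync(BuildRequestUri(text, from, to), ct))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

Passing the HttpClient in through the constructor lets the application reuse a single instance and makes the class easier to test with a stub message handler.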

Limitations

Several issues exist in the current implementation that should be addressed in production code:

The text-to-speech functionality relies on external TTS endpoints. When the input text exceeds a certain length, the official web interfaces may split the audio request into multiple chunks. The current implementation does not perform this splitting, so long inputs may produce truncated audio.
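
One way to approach the splitting is to break the text at word boundaries so that no chunk exceeds a given length, then request audio for each chunk in turn. The sketch below assumes the limit is passed in by the caller, since the actual limit of each TTS endpoint is not stated here; SpeechChunker is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SpeechChunker
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    // Splits text into chunks of at most maxLength characters, breaking at
    // whitespace where possible. maxLength is an assumption supplied by the caller.
    public static List<string> Split(string text, int maxLength)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (maxLength <= 0) throw new ArgumentOutOfRangeException(nameof(maxLength));

        var chunks = new List<string>();
        var current = new StringBuilder();

        foreach (string word in text.Split(Whitespace, StringSplitOptions.RemoveEmptyEntries))
        {
            string remaining = word;

            // A single word longer than the limit is cut into fixed-size pieces.
            while (remaining.Length > maxLength)
            {
                Flush(chunks, current);
                chunks.Add(remaining.Substring(0, maxLength));
                remaining = remaining.Substring(maxLength);
            }

            // Start a new chunk if the word plus a separating space would not fit.
            if (current.Length > 0 && current.Length + 1 + remaining.Length > maxLength)
                Flush(chunks, current);

            if (current.Length > 0) current.Append(' ');
            current.Append(remaining);
        }

        Flush(chunks, current);
        return chunks;
    }

    private static void Flush(List<string> chunks, StringBuilder current)
    {
        if (current.Length == 0) return;
        chunks.Add(current.ToString());
        current.Clear();
    }
}
```

Splitting on whitespace does not suit languages written without spaces, such as Chinese; for those, breaking at punctuation or at a fixed character count would be a more fitting rule.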

In some environments, audio playback is handled through a WebBrowser control. This approach may not work reliably on every system; invoking the system's media player directly would give more consistent results.
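
A possible way to hand playback to the system is to save the audio to a local file and let the operating system shell open it with whatever application is registered for that file type. The sketch below assumes the audio has already been downloaded to disk; AudioLauncher is a hypothetical helper, and which player actually opens depends on the user's file associations.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class AudioLauncher
{
    // Builds the start info separately so it can be inspected without launching anything.
    public static ProcessStartInfo CreateStartInfo(string audioFilePath)
    {
        if (string.IsNullOrWhiteSpace(audioFilePath))
            throw new ArgumentException("A file path is required.", nameof(audioFilePath));

        return new ProcessStartInfo
        {
            FileName = Path.GetFullPath(audioFilePath),
            // Let the operating system shell choose the application for this file type.
            UseShellExecute = true
        };
    }

    // Opens a previously downloaded audio file in the default application.
    public static void Play(string audioFilePath)
    {
        if (!File.Exists(audioFilePath))
            throw new FileNotFoundException("Audio file not found.", audioFilePath);

        // Process.Start may return null when an existing process is reused;
        // a using statement tolerates a null resource.
        using (Process process = Process.Start(CreateStartInfo(audioFilePath)))
        {
        }
    }
}
```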

The JSON parsing logic is minimal and permissive. For robustness, a dedicated JSON library like Newtonsoft.Json or System.Text.Json should be used to properly handle edge cases in API responses.
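
As an illustration of stricter parsing with System.Text.Json, the sketch below validates the response before reading it. The response shape, a JSON object with a string property named "translation", is purely hypothetical; the real Google and Baidu payloads differ and are not reproduced here.

```csharp
using System;
using System.Text.Json;

public static class TranslationResponseParser
{
    // Hypothetical response shape: { "translation": "..." }
    // Returns false instead of throwing when the payload is malformed or unexpected.
    public static bool TryParse(string json, out string translation)
    {
        translation = null;
        if (string.IsNullOrWhiteSpace(json)) return false;

        try
        {
            using (JsonDocument document = JsonDocument.Parse(json))
            {
                JsonElement root = document.RootElement;
                if (root.ValueKind != JsonValueKind.Object) return false;
                if (!root.TryGetProperty("translation", out JsonElement value)) return false;
                if (value.ValueKind != JsonValueKind.String) return false;

                translation = value.GetString();
                return true;
            }
        }
        catch (JsonException)
        {
            // The body was not valid JSON (for example, an HTML error page).
            return false;
        }
    }
}
```

Checking ValueKind before reading each value means an error page or a changed response format results in a clean failure rather than an exception deep inside the adapter.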

Network requests may fail silently without proper timeout handling or retry logic. Adding appropriate error handling and logging would improve reliability.
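
A small retry helper is one way to address this. The sketch below is an assumption-laden example rather than a prescribed fix: the Retry class, the linear backoff, and the choice to treat HttpRequestException and TaskCanceledException as transient are all illustrative decisions. HttpClient signals an elapsed timeout by throwing TaskCanceledException, which is why it is included.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class Retry
{
    // Runs the operation up to maxAttempts times, waiting a little longer
    // after each transient failure (linear backoff).
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation, int maxAttempts, TimeSpan baseDelay)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && IsTransient(ex))
            {
                // Log the failure instead of swallowing it silently.
                Console.Error.WriteLine($"Attempt {attempt} failed: {ex.Message}");
                await Task.Delay(TimeSpan.FromTicks(baseDelay.Ticks * attempt));
            }
        }
    }

    // Network errors and timeouts are treated as worth retrying; this is an assumption.
    private static bool IsTransient(Exception ex) =>
        ex is HttpRequestException || ex is TaskCanceledException;
}
```

To bound each attempt, set HttpClient.Timeout on the shared client (for example, to a few seconds) before wrapping the request in Retry.ExecuteAsync. On the final attempt the exception propagates to the caller, so failures are never hidden.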

Tags: C# Translation API Desktop Application Google API Baidu API

Posted on Sat, 09 May 2026 15:47:25 +0000 by crochk