# Unit Tests for PII Detection Service

Unit and integration tests for all 5 service classes.

## 🧪 Test Suite

### Test Files

1. **FileHandlerTest.php** - Tests file validation and PDF conversion
2. **TextractServiceTest.php** - Tests AWS Textract integration
3. **ComprehendServiceTest.php** - Tests AWS Comprehend PII detection
4. **RegistryManagerTest.php** - Tests registry logic and PII mapping
5. **PIIDetectionServiceTest.php** - Integration tests for complete workflow

### Test Runner

**TestRunner.php** - Simple test framework that:
- Runs all test files automatically
- Provides assertion methods
- Reports pass/fail results
- Tracks statistics

## 🚀 Running Tests

### Run All Tests

```bat
cd C:\xampp\htdocs\redact
php src\classes\unit\TestRunner.php
```

### Run Individual Test

```php
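<?php
// Paths are relative to the project root; load the runner before the test class.
require_once 'src/classes/unit/TestRunner.php';
require_once 'src/classes/unit/FileHandlerTest.php';

// Attach the runner so the test can report its assertions, then run it.
$runner = new TestRunner();
$test = new FileHandlerTest();
$test->setRunner($runner);
$test->runTests();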
```

## 📋 Test Coverage

### FileHandlerTest (6 tests)
- ✅ File validation for valid uploads
- ✅ File info extraction
- ✅ PDF to image conversion
- ✅ Image file validation
- ✅ Large file rejection
- ✅ Invalid file type rejection

### TextractServiceTest (4 tests)
- ✅ Service initialization
- ✅ Document analysis
- ✅ Block extraction (WORD, LINE, LAYOUT)
- ✅ Layout type detection

### ComprehendServiceTest (5 tests)
- ✅ Service initialization
- ✅ General PII detection
- ✅ Name detection
- ✅ Bank details detection (with context)
- ✅ Multiple PII types detection

### RegistryManagerTest (8 tests)
- ✅ Registry initialization
- ✅ Build registries from blocks
- ✅ Layout extraction
- ✅ Word registry building
- ✅ Determine layouts to process
- ✅ PII mapping to word blocks
- ✅ Apply PII to pages
- ✅ Statistics calculation

### PIIDetectionServiceTest (5 tests)
- ✅ Service initialization
- ✅ Process document from file path
- ✅ Process uploaded document
- ✅ Result structure validation
- ✅ End-to-end PII detection

**Total: 28 unit tests**

## 📊 Sample Test Output

```
╔════════════════════════════════════════════════════════════════╗
║              PII DETECTION SERVICE - UNIT TESTS                 ║
╚════════════════════════════════════════════════════════════════╝

📋 Running: FileHandlerTest
----------------------------------------------------------------------
  ✅ Valid PDF file should pass validation
  ✅ File extension should be pdf
  ✅ Should recognize file as PDF
  ✅ Should not recognize PDF as image
  ✅ PDF conversion should succeed
  ✅ Should return images array

📋 Running: TextractServiceTest
----------------------------------------------------------------------
  ✅ TextractService should initialize
  🔄 Analyzing document with Textract...
  ✅ Textract analyze should succeed
  ✅ Should return data
  ✅ Should extract blocks

📋 Running: PIIDetectionServiceTest
----------------------------------------------------------------------
  ✅ PIIDetectionService should initialize
  🔄 Processing document (this may take 30-60 seconds)...
  ✅ Document processing should succeed
  ✅ Should include processing_time
  ✅ Should detect layouts
  ✅ Should find PII instances
  
  📊 PII Detection Results:
     - Unique words: 523
     - PII words: 18
     - Total PII instances: 156
     - Comprehend calls: 65
     - Optimization: 0%

======================================================================
SUMMARY
======================================================================
Total Tests:  28
✅ Passed:     28
❌ Failed:     0
Duration:     45.32s

🎉 ALL TESTS PASSED!
```

## 🎯 Test Sample

The test suite uses **`BeytekinS Payslips.pdf`** located in this directory as the sample document for integration testing.

### Expected PII in Sample:
- Names (employee names, manager names)
- Addresses
- Account numbers
- Dates
- Possibly phone numbers or emails

## ⚠️ Requirements

### To Run All Tests:
- ✅ PHP 8.2+
- ✅ Imagick extension (for PDF conversion)
- ✅ Ghostscript (for Imagick PDF support)
- ✅ AWS credentials configured in `src/config/config.php`
- ✅ Sample PDF file in `src/classes/unit/`

### Tests Will Skip If:
- ⏭️ Imagick not installed → PDF conversion tests skipped
- ⏭️ AWS credentials not configured → API tests skipped
- ⏭️ Sample PDF not found → integration tests skipped
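
The skip conditions above could be checked with a small helper along these lines. This is a minimal sketch, not the suite's actual code: the function name `missingPrerequisites` and the config keys `aws_key` and `aws_secret` are assumptions made for illustration. Only `extension_loaded()` and `is_file()` are standard PHP.

```php
<?php
/**
 * Hypothetical helper: lists the reasons why some tests should be skipped.
 * An empty array means every prerequisite is present.
 *
 * @param array  $config    Assumed shape: ['aws_key' => string, 'aws_secret' => string]
 * @param string $samplePdf Path to the sample PDF used by the integration tests
 * @return string[]
 */
function missingPrerequisites(array $config, string $samplePdf): array
{
    $reasons = [];

    // PDF conversion tests need the Imagick extension.
    if (!extension_loaded('imagick')) {
        $reasons[] = 'Imagick not installed';
    }

    // API tests need non-empty credentials (the key names are assumed).
    if (empty($config['aws_key']) || empty($config['aws_secret'])) {
        $reasons[] = 'AWS credentials not configured';
    }

    // Integration tests need the sample document on disk.
    if (!is_file($samplePdf)) {
        $reasons[] = 'Sample PDF not found';
    }

    return $reasons;
}

// Example: report each missing prerequisite instead of failing the run.
foreach (missingPrerequisites([], __DIR__ . '/missing.pdf') as $reason) {
    echo "  ⏭️ Skipping: {$reason}\n";
}
```

Returning reasons rather than a single boolean lets each test class decide which of its tests depend on which prerequisite.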

## 🔧 Assertions Available

```php
$runner->assert(bool $condition, string $testName)
$runner->assertEquals($expected, $actual, string $testName)
$runner->assertTrue($value, string $testName)
$runner->assertFalse($value, string $testName)
$runner->assertNotNull($value, string $testName)
$runner->assertGreaterThan($threshold, $value, string $testName)
```
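
To illustrate how assertion methods like these can feed the pass/fail statistics, here is a minimal sketch of a runner. It is not the real `TestRunner`: the class name `SketchRunner`, the `stats()` method, the boolean return values, and the use of strict comparison in `assertEquals` are all assumptions of this example.

```php
<?php
/**
 * Illustrative stand-in for TestRunner; the real class may differ.
 */
class SketchRunner
{
    private int $passed = 0;
    private int $failed = 0;

    /** Records one result and prints a line in the style of the sample output. */
    public function assert(bool $condition, string $testName): bool
    {
        if ($condition) {
            $this->passed++;
            echo "  ✅ {$testName}\n";
        } else {
            $this->failed++;
            echo "  ❌ {$testName}\n";
        }
        return $condition;
    }

    public function assertEquals($expected, $actual, string $testName): bool
    {
        // Strict comparison is a choice made for this sketch.
        return $this->assert($expected === $actual, $testName);
    }

    public function assertGreaterThan($threshold, $value, string $testName): bool
    {
        return $this->assert($value > $threshold, $testName);
    }

    /** Hypothetical accessor for the counters. */
    public function stats(): array
    {
        return ['passed' => $this->passed, 'failed' => $this->failed];
    }
}

$runner = new SketchRunner();
$runner->assertEquals('pdf', 'pdf', 'File extension should be pdf');
$runner->assertGreaterThan(0, 18, 'Should find PII instances');
```

Routing every assertion through one method keeps the counting and the output format in a single place.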

## 📝 Writing New Tests

### Template:

```php
<?php
class MyNewTest extends BaseTest
{
    public function runTests(): void
    {
        $this->testSomething();
        $this->testSomethingElse();
    }
    
    private function testSomething(): void
    {
        // Arrange
        $service = new MyService();
        
        // Act
        $result = $service->doSomething();
        
        // Assert
        $this->runner->assertTrue($result['success'], 'Operation should succeed');
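    }

    private function testSomethingElse(): void
    {
        // MyService and doSomething() are placeholders for the class under test
        $result = (new MyService())->doSomething();

        $this->runner->assertNotNull($result, 'Result should not be null');
    }
}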
```
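
The template extends `BaseTest`, which this README does not show. Judging only from how it is used here (`setRunner()`, `$this->runner`, `runTests()`), a minimal base class might look like the sketch below. Treat it as an assumption; the actual class may carry more helpers.

```php
<?php
// Sketch only: the real BaseTest is not shown in this README and may differ.
abstract class BaseTest
{
    /** Runner that provides the assertion methods (assumed to be injected). */
    protected object $runner;

    public function setRunner(object $runner): void
    {
        $this->runner = $runner;
    }

    /** Each concrete test class calls its individual test methods from here. */
    abstract public function runTests(): void;
}
```

Because the runner is injected rather than created inside the test, the same test class can be run by the full suite or on its own.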

## 🎓 Best Practices

1. **Arrange-Act-Assert** pattern for clarity
2. **Skip gracefully** if dependencies are unavailable
3. **Informative test names** describing expected behavior
4. **Test both success and failure** cases
5. **Use sample data** representative of real usage
6. **Log progress** for long-running tests
7. **Check structure** of complex results
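
As an example of checking the structure of a complex result, the sketch below reports which expected keys are absent from a result array. The helper name `missingKeys` and the `pages` key are hypothetical; `success` and `processing_time` are the keys mentioned elsewhere in this README.

```php
<?php
/**
 * Returns the required keys that are absent from $result.
 *
 * @param array    $result       Result array returned by a service
 * @param string[] $requiredKeys Keys the test expects to find
 * @return string[]
 */
function missingKeys(array $result, array $requiredKeys): array
{
    return array_values(array_diff($requiredKeys, array_keys($result)));
}

$result = ['success' => true, 'processing_time' => 45.32];
$missing = missingKeys($result, ['success', 'processing_time', 'pages']);

// Prints "Missing: pages" because the hypothetical 'pages' key is absent.
echo $missing ? 'Missing: ' . implode(', ', $missing) . "\n" : "Structure OK\n";
```

Listing the missing keys in the test name or log makes a structural failure easier to diagnose than a bare pass/fail.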

## 🐛 Debugging Failed Tests

### Common Issues:

**"Imagick not installed"**
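- Install the Imagick PHP extension and enable it in the `php.ini` used by the PHP CLI
- Confirm it is loaded with `php -m` (look for `imagick`)
- Restart Apache if the code also runs through the web server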

**"AWS credentials not configured"**
- Check `src/config/config.php`
- Ensure valid AWS keys

**"Sample PDF not found"**
- Place test PDF in `src/classes/unit/`
- Check filename matches exactly

**"Timeout"**
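- Increase `max_execution_time` in `php.ini` (the PHP CLI defaults to `0`, meaning no limit, so this mainly matters when tests run through the web server)
- Or add `ini_set('max_execution_time', 300);` at the start of long-running tests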
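
One way to combine the timeout setting with progress logging is to wrap each long-running step, as in this sketch. The function name `runTimed` and its return shape are assumptions for illustration, not part of the test suite.

```php
<?php
/**
 * Hypothetical wrapper: raises the time limit, runs a step, and logs its duration.
 *
 * @return array{value: mixed, seconds: float}
 */
function runTimed(string $label, callable $step, int $limitSeconds = 300): array
{
    // Raise the execution limit for this script before the slow call starts.
    ini_set('max_execution_time', (string) $limitSeconds);

    echo "  🔄 {$label}...\n";
    $start = microtime(true);
    $value = $step();
    $elapsed = microtime(true) - $start;
    printf("  ⏱️ %s finished in %.2fs\n", $label, $elapsed);

    return ['value' => $value, 'seconds' => $elapsed];
}

// Example with a trivial step standing in for a slow API call.
$outcome = runTimed('Processing document', fn () => ['success' => true]);
```

Logging the elapsed time per step also shows which call is slow when a run does hit a limit.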

## 📈 Future Enhancements

- [ ] Add PHPUnit integration
- [ ] Add code coverage reports
- [ ] Add performance benchmarks
- [ ] Add mock AWS responses for offline testing
- [ ] Add stress tests for large documents
- [ ] Add parallel test execution

---

**Run tests regularly to ensure code quality!** 🧪✨

