[Bug]: Rate limiter decrements by incorrect token count for /v1/responses endpoint #18671

@AmethystLiang

What happened?

Description

When using the /v1/responses endpoint with team-per-model TPM rate limiting enabled, the x-ratelimit-model_per_team-remaining-tokens header decreases by only ~2 tokens per request, regardless of actual token consumption reported in total_tokens.

Environment

  • Affected: Original LiteLLM
  • Not affected: Stably fork

Steps to Reproduce

Run the following curl command multiple times:

  curl -sD - -o - <LITELLM_PROXY_URL>/v1/responses \
    -H "Authorization: Bearer <API_KEY>" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gemini-3-flash-preview",
      "input": "hello"
    }' \
  | grep -oE '"total_tokens"[[:space:]]*:[[:space:]]*[0-9]+|x-ratelimit-model_per_team-remaining-tokens:[[:space:]]*[0-9]+'

Expected Behavior

x-ratelimit-model_per_team-remaining-tokens should decrease by the value reported in total_tokens (~35 tokens per request).

Actual Behavior

| Request | remaining-tokens | total_tokens | Actual decrease |
|---------|------------------|--------------|-----------------|
| 1       | 1,999,998        | 35           | –               |
| 2       | 1,999,996        | 35           | 2               |
| 3       | 1,999,994        | 35           | 2               |

The rate limiter only decrements by 2 tokens instead of the actual 35 tokens consumed.
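The mismatch can be checked directly from the measured values above (a quick arithmetic sketch, not LiteLLM code):

```python
# Values taken from the measurements in the table above.
remaining = [1_999_998, 1_999_996, 1_999_994]  # header value after requests 1-3
total_tokens = 35                              # usage reported per request

# Per-request decrement the limiter actually applied.
actual = [a - b for a, b in zip(remaining, remaining[1:])]
print(actual)                    # [2, 2] -- far below the reported usage
print(total_tokens - actual[0])  # 33 tokens per request go uncounted
```

At this rate, each request under-charges the TPM budget by roughly 94% of its real consumption.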

Impact

  • Rate limiting is ineffective for /v1/responses endpoint
  • Users can consume significantly more tokens than their rate limit should allow
  • TPM quotas are not being enforced correctly
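For context, the expected accounting is simple: each request should reduce the remaining budget by the `total_tokens` reported in its response body. A minimal sketch of that behavior (illustrative only; `TPMBudget` is a hypothetical name, not a LiteLLM internal):

```python
class TPMBudget:
    """Minimal illustrative tokens-per-minute budget (a sketch, not LiteLLM's code)."""

    def __init__(self, limit: int) -> None:
        self.remaining = limit

    def record_usage(self, usage: dict) -> int:
        # Correct behavior: decrement by the usage the provider reported
        # in the response, not by a fixed or pre-call-estimated amount.
        self.remaining -= usage["total_tokens"]
        return self.remaining


budget = TPMBudget(limit=2_000_000)
print(budget.record_usage({"total_tokens": 35}))  # 1999965
print(budget.record_usage({"total_tokens": 35}))  # 1999930
```

The bug reported here is consistent with the decrement being computed from something other than the response's `total_tokens` for the /v1/responses code path.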

Additional Context

  • This issue is specific to the /v1/responses endpoint with model_per_team TPM rate limiting
  • Requires team-per-model TPM rate limiting to be configured

Relevant log output

=== Request 1 ===
x-ratelimit-model_per_team-remaining-tokens: 1999998
"total_tokens":104

=== Request 2 ===
x-ratelimit-model_per_team-remaining-tokens: 1999996
"total_tokens":104

=== Request 3 ===
x-ratelimit-model_per_team-remaining-tokens: 1999994
"total_tokens":104

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on?

v1.80.11
